A Corporate Dedication to 
Quality and Reliability 

National Semiconductor is an industry leader in the 
manufacture of high quality, high reliability integrated 
circuits. We have been the leading proponent of driving 
down IC defects and extending product lifetimes. 
From raw material through product design, manufacturing 
and shipping, our quality and reliability are second 
to none. 

We are proud of our success ... it sets a standard for 
others to achieve. Yet, our quest for perfection is on- 
going so that you, our customer, can continue to rely 
on National Semiconductor Corporation to produce 
high quality products for your design systems. 



Charles E. Sporck 

President, Chief Executive Officer 

National Semiconductor Corporation 



Introduction 


National’s Embedded System Processor™ family offers the 
most complete solution to your 32-bit embedded processor 
needs via CPUs, slave processors, system peripherals, 
evaluation/development tools and software. 

We at National Semiconductor firmly believe that it takes a 
total family of Embedded System Processors to effectively 
meet the needs of an embedded system designer. 

This Databook presents technical descriptions of our 32-bit 
Embedded System Processors, slave processors, peripher- 
als, software and development tools. It is designed to be 
updated frequently so that our customers can have the lat- 
est technical information on the Embedded System Proces- 
sor. 

When we at National Semiconductor began designing the 
Embedded System Processor family, we decided to support 
an architecture that addressed the needs of embedded de- 
sign. We chose to take the time to design it properly so that 
optimal system cost/performance, high system integration, 
and total system solutions were addressed. Working from 
the top down, we analyzed the issues and anticipated the 
embedded computing needs. The result is an advanced and 
efficient family of Embedded System Processors. 

Software productivity has become a major issue in embed- 
ded system product development. In embedded systems 
this issue centers around the capability of the processor to 
maximize the utility of software relative to shorter develop- 
ment cycles, under the constraints of lower cost and higher 
performance. 


In short, the degree to which the processor can maximize 
software utility directly affects the cost of a product, its reli- 
ability, and time to market. It also affects future software 
modification for product enhancement or rapid advances in 
hardware technology. 

Our approach has been to define an architecture address- 
ing these software issues most effectively. National Semi- 
conductor’s Embedded System Processor family combines 
32-bit performance with efficient management of a large ad- 
dress space. It facilitates high-level language program de- 
velopment and efficient instruction execution. Floating-point 
is integrated into the architecture. 

But we didn’t stop there. Advanced architecture isn’t 
enough. Our total product system solution approach in- 
cludes the hardware, software, and development support 
products necessary for your design. The evaluation board, 
in-system emulator, software development tools, and third 
party software are available now for your evaluation and 
development. 

The Embedded System Processor is a solid foundation from 
which National Semiconductor can build solutions for your 
future designs while satisfying your needs today. 

For further information please contact your local sales of- 
fice. 
Key Features of National’s 
Embedded System Processors™ 


Some of the features that set the Embedded System Proc- 
essor family apart as the best choice for 32-bit designs are 
as follows: 

FAMILY OF EMBEDDED SYSTEM PROCESSORS 

The Embedded System Processor is more than a single 
chip set; it is a family of chip sets. By mixing and matching 
CPUs with compatible slave processors and support chips, 
an embedded system designer has an unprecedented degree 
of flexibility in matching price/performance to the end 
product. 

CLEANEST 32-BIT OPTIMIZED ARCHITECTURE 

The Embedded System Processor was designed around a 
32-bit architecture from the beginning. It has a fully symmet- 
rical instruction set so that all addressing modes and all 
data types can be operated on by all instructions. This 
makes it easy to learn the architecture, easy to program in 
assembly language, and easy to write code-efficient, high- 
level language compilers. 

APPLICATION-SPECIFIC SLAVE PROCESSORS 

The Embedded System Processor architecture allows users to 
design their own application-specific slave processors to 
interface with the existing chip set. These processors can 
be used to increase the overall system performance by 
accelerating customized CPU instructions that would otherwise 
be implemented in software. At the same time, software 
compatibility is maintained, i.e., it is always possible to 
substitute lower-cost software modules in place of the slave 
processor. 

FLOATING-POINT SUPPORT 

National offers a complete set of floating-point solutions. 
These include the NS32081 Floating-Point Unit and the 
NS32381 Floating-Point Unit. The NS32081 provides high-speed 
arithmetic computation with high precision and accuracy 
at low cost. The NS32381 provides low power consumption 
and even greater performance than the NS32081 
while maintaining high precision and accuracy. 

HIGH-LEVEL LANGUAGE SUPPORT 

National’s Embedded System Processor has special fea- 
tures that support high-level languages, thus improving soft- 
ware productivity and reducing development costs. For ex- 
ample, there are special instructions that help the compiler 
deal with structured data types such as Arrays, Strings, Rec- 
ords, and Stacks. Also, modular programming is supported 
by special hardware registers, software instructions, an ex- 
ternal addressing mode, and architecturally supported link 
tables. 
Component Descriptions 

                                                               Bus Width (bits)
Device    Description                                          Internal  Ext. Addr Ext. Data Process       Package Type

CENTRAL PROCESSING UNITS (CPUs) 

NS32GX32  High-Performance 32-Bit Embedded System Processor    32        32        32        M²CMOS        175-pin PGA
NS32CG16  High-Performance Printer/Display Processor           32        24        16        CMOS          68-pin PCC

SLAVE PROCESSORS 

NS32081   Floating-Point Unit                                  64        —         16        XMOS          24-pin DIP (Dual-In-Line Package)
NS32381   Floating-Point Unit                                  64        —         16        CMOS          68-pin PGA

PERIPHERALS 

NS32202   Interrupt Control Unit                               32        —         16        XMOS (NMOS)   40-pin DIP (Dual-In-Line Package)
NS32203   Direct Memory Access Controller                      —         —         16        XMOS (NMOS)   48-pin DIP (Dual-In-Line Package)
Systems and Software Chart 

[Chart not reproduced. Column headings: Board Level Products, Software, Emulators, Host Development Environments.] 
NS32GX32-20/NS32GX32-25/NS32GX32-30 
High-Performance 32-Bit Embedded System Processor 


General Description 


The NS32GX32 is a high-performance 32-bit embedded 
system processor in the Series 32000® family. It is software 
compatible with the previous microprocessors in the family 
but with a greatly enhanced internal implementation. 

The NS32GX32 integrates more than 320,000 transistors 
fabricated in a 1.25 μm double-metal CMOS technology. 
The advanced technology and mainframe-like design of the 
device enable it to achieve peak performance of 15 million 
instructions per second. 

The high-performance specifications are the result of a four- 
stage instruction pipeline, on-chip instruction and data 
caches, and a significantly increased clock frequency. In ad- 
dition, the system interface provides optimal support for ap- 
plications spanning a wide range, from low-cost, real-time 
controllers to highly sophisticated, embedded systems. 

In addition to generally improved performance, the 
NS32GX32 offers much faster interrupt service and task 
switching for real-time applications. 


Features 

■ Software compatible with the Series 32000 family 

■ 32-bit architecture and implementation 

■ 4-GByte uniform addressing space 

■ 4-Stage instruction pipeline 

■ 512-Byte on-chip instruction cache 

■ 1024-Byte on-chip data cache 

■ High-performance bus 

— Separate 32-bit address and data lines 

— Burst mode memory accessing 

— Dynamic bus sizing 

■ Floating-point support via the NS32381 

■ 1.25 μm double-metal CMOS technology 

■ 175-pin PGA package 
1.0 Product Introduction 

The NS32GX32 is an extremely sophisticated microproces- 
sor in the Series 32000 family with a full 32-bit architecture 
and implementation optimized for high-performance appli- 
cations. 

By employing a number of mainframe-like features, the device 
can deliver 15 MIPS peak performance with no wait 
states at a frequency of 30 MHz. 

The NS32GX32 is fully software compatible with all the other 
Series 32000 CPUs. The architectural features of the Series 
32000 family, and particularly the NS32GX32 CPU, are 
described briefly below. 

Powerful Addressing Modes. Nine addressing modes 
available to all instructions are included to access data 
structures efficiently. 

Data Types. The architecture provides for numerous data 
types, such as byte, word, doubleword, and BCD, which may 
be arranged into a wide variety of data structures. 

Symmetric Instruction Set. While avoiding special case 
instructions that compilers can’t use, the Series 32000 ar- 
chitecture incorporates powerful instructions for control op- 
erations, such as array indexing and external procedure 
calls, which save considerable space and time for compiled 
code. 

Memory-to-Memory Operations. The Series 32000 CPUs 
represent two-address machines. This means that each op- 
erand can be referenced by any one of the addressing 
modes provided. 

This powerful memory-to-memory architecture permits 
memory locations to be treated as registers for all useful 
operations. This is important for temporary operands as well 
as for context switching. 


FIGURE 2-1. NS32GX32 Internal Registers 


Large, Uniform Addressing. The NS32GX32 has 32-bit 
address pointers that can address up to 4 gigabytes without 
requiring any segmentation. 

Modular Software Support. Any software package for the 
Series 32000 family can be developed independent of all 
other packages, without regard to individual addressing. In 
addition, ROM code is totally relocatable and easy to ac- 
cess, which allows a significant reduction in hardware and 
software costs. 

Software Processor Concept. The Series 32000 architec- 
ture allows future expansions of the instruction set that can 
be executed by special slave processors, acting as exten- 
sions to the CPU. This concept of slave processors is 
unique to the Series 32000 family. It allows software com- 
patibility even for future components because the slave 
hardware is transparent to the software. With future ad- 
vances in semiconductor technology, the slaves can be 
physically integrated on the CPU chip itself. 

To summarize, the architectural features cited above pro- 
vide three primary performance advantages and character- 
istics: 

• High-level language support 

• Easy future growth path 

• Application flexibility 

2.0 Architectural Description 

2.1 REGISTER SET 

The NS32GX32 CPU has 21 internal registers grouped ac- 
cording to functions as follows: 8 general purpose, 7 ad- 
dress, 1 processor status, 1 configuration, and 4 debug. All 
registers are 32 bits wide except for the module and proces- 
sor status, which are each 16 bits wide. Figure 2-1 shows 
the NS32GX32 internal registers. 




2.1.1 General Purpose Registers 

There are eight registers (R0-R7) used for satisfying the 
high speed general storage requirements, such as holding 
temporary variables and addresses. The general purpose 
registers are free for any use by the programmer. They are 
32 bits in length. If a general purpose register is specified for 
an operand that is eight or 16 bits long, only the low part of 
the register is used; the high part is not referenced or modi- 
fied. 
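
The partial-register rule above can be illustrated with a minimal Python sketch. The function name `write_register` and its arguments are hypothetical and serve only to model the behavior described in the text; this is not a description of the hardware implementation.

```python
MASK32 = 0xFFFFFFFF  # general purpose registers are 32 bits wide


def write_register(old_value: int, operand: int, size_bytes: int) -> int:
    """Return the register contents after writing a 1-, 2- or 4-byte operand.

    Only the low part of the register is replaced; the high part keeps
    its previous contents, as stated for 8-bit and 16-bit operands.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("operand size must be 1, 2 or 4 bytes")
    low_mask = (1 << (8 * size_bytes)) - 1
    return (old_value & ~low_mask & MASK32) | (operand & low_mask)
```

For example, writing the byte 0xAB into a register holding 0x12345678 leaves 0x123456AB in this model.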

2.1.2 Address Registers 

The seven address registers are used by the processor to 
implement specific address functions. A description of them 
follows. 

PC— Program Counter. The PC register is a pointer to the 
first byte of the instruction currently being executed. The PC 
is used to reference memory in the program section. 

SP0, SP1 — Stack Pointers. The SP0 register points to the 
lowest address of the last item stored on the INTERRUPT 
STACK. This stack is normally used only by the operating 
system. It is used primarily for storing temporary data, and 
holding return information for operating system subroutines 
and interrupt and trap service routines. The SP1 register 
points to the lowest address of the last item stored on the 
USER STACK. This stack is used by normal user programs 
to hold temporary data and subroutine return information. 
When a reference is made to the selected Stack Pointer 
(see PSR S-bit), the terms ‘SP Register’ or ‘SP’ are used. 
SP refers to either SP0 or SP1, depending on the setting of 
the S bit in the PSR register. If the S bit in the PSR is 0, SP 
refers to SP0. If the S bit in the PSR is 1, SP refers to 
SP1. 

The NS32GX32 also allows the SP1 register to be directly 
loaded and stored using privileged forms of the LPRi and 
SPRi instructions, regardless of the setting of the PSR S-bit. 
When SP1 is accessed in this manner, it is referred to as 
‘USP Register’ or simply ‘USP’. 

Stacks in the Series 32000 family grow downward in memo- 
ry. A Push operation pre-decrements the Stack Pointer by 
the operand length. A Pop operation post-increments the 
Stack Pointer by the operand length. 
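
The stack behavior described above (selection of SP0 or SP1 by the S bit, pre-decrement on Push, post-increment on Pop) can be sketched in Python as follows. The class `StackModel` and its fields are hypothetical names used for illustration, and the byte order follows the least-significant-byte-first convention given in Section 2.2.

```python
class StackModel:
    """Toy model of the two downward-growing stacks (SP0 and SP1)."""

    def __init__(self, sp0: int, sp1: int, memory_size: int = 64):
        self.memory = bytearray(memory_size)
        self.sp = [sp0, sp1]  # index 0 = interrupt stack, 1 = user stack
        self.s_bit = 0        # PSR S bit: selects which pointer is 'SP'

    def push(self, value: int, size: int) -> None:
        # Pre-decrement the selected stack pointer by the operand length.
        self.sp[self.s_bit] -= size
        addr = self.sp[self.s_bit]
        self.memory[addr:addr + size] = value.to_bytes(size, "little")

    def pop(self, size: int) -> int:
        # Read the item, then post-increment by the operand length.
        addr = self.sp[self.s_bit]
        value = int.from_bytes(self.memory[addr:addr + size], "little")
        self.sp[self.s_bit] += size
        return value
```

In this model the stack pointer always holds the lowest address of the last item stored, which matches the definition of SP0 and SP1 given earlier.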

FP — Frame Pointer. The FP register is used by a procedure 
to access parameters and local variables on the stack. The 
FP register is set up on procedure entry with the ENTER 
instruction and restored on procedure termination with the 
EXIT instruction. 

The frame pointer holds the address in memory occupied by 
the old contents of the frame pointer. 

SB— Static Base. The SB register points to the global vari- 
ables of a software module. This register is used to support 
relocatable global variables for software modules. The SB 
register holds the lowest address in memory occupied by 
the global variables of a module. 


INTBASE — Interrupt Base. The INTBASE register holds 
the address of the dispatch table for interrupts and traps 
(Section 3.2.1). 

MOD — Module. The MOD register holds the address of the 
module descriptor of the currently executing software module. 
The MOD register is 16 bits long; therefore the module 
table must be contained within the first 64 kbytes of memory. 

2.1.3 Processor Status Register 

The Processor Status Register (PSR) holds status informa- 
tion for the microprocessor. 

The PSR is sixteen bits long, divided into two eight-bit 
halves. The low order eight bits are accessible to all pro- 
grams, but the high order eight bits are accessible only to 
programs executing in Supervisor Mode. 

C The C bit indicates that a carry or borrow occurred after 
an addition or subtraction instruction. It can be used 
with the ADDC and SUBC instructions to perform multi- 
ple-precision integer arithmetic calculations. It may 
have a setting of 0 (no carry or borrow) or 1 (carry or 
borrow). 

T The T bit causes program tracing. If this bit is set to 1, a 
TRC trap is executed after every instruction (Section 
3.3.1). 

L The L bit is altered by comparison instructions. In a 
comparison instruction the L bit is set to “1” if the sec- 
ond operand is less than the first operand, when both 
operands are interpreted as unsigned integers. Other- 
wise, it is set to “0”. In Floating-Point comparisons, this 
bit is always cleared. 

V The V-bit enables generation of a trap (OVF) when an 
integer arithmetic operation overflows. 

F The F bit is a general condition flag, which is altered by 
many instructions (e.g., integer arithmetic instructions 
use it to indicate overflow). 

Z The Z bit is altered by comparison instructions. In a 
comparison instruction the Z bit is set to “1” if the sec- 
ond operand is equal to the first operand; otherwise it is 
set to “0”. 

N The N bit is altered by comparison instructions. In a 
comparison instruction the N bit is set to “1” if the second 
operand is less than the first operand, when both 
operands are interpreted as signed integers. Otherwise, 
it is set to “0”. 

U If the U bit is “1”, no privileged instructions may be 
executed. If the U bit is “0”, all instructions may be 
executed. When U = 0 the processor is said to be in 
Supervisor Mode; when U = 1 the processor is said to 
be in User Mode. A User Mode program is restricted 
from executing certain instructions and accessing 
certain registers which could interfere with the operating 
system. For example, a User Mode program is prevented 
from changing the setting of the flag used to indicate 
its own privilege mode. A Supervisor Mode program is 
assumed to be a trusted part of the operating system, 
hence it has no such restrictions. 

S The S bit specifies whether the SP0 register or SP1 
register is used as the Stack Pointer. The bit is automatically 
cleared on interrupts and traps. It may have a 
setting of 0 (use the SP0 register) or 1 (use the SP1 
register). 

P The P bit prevents a TRC trap from occurring more than 
once for an instruction (Section 3.3.1). It may have a 
setting of 0 (no trace pending) or 1 (trace pending). 

I If I = 1, then all interrupts will be accepted. If I = 0, 
only the NMI interrupt is accepted. Trap enables are not 
affected by this bit. 

FIGURE 2-2. Processor Status Register (PSR) 
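
The rules for the L, Z and N bits in an integer comparison can be summarized with a small Python sketch. The helper names `to_signed` and `compare_flags` are hypothetical; the sketch returns the three flag values by name and makes no assumption about their bit positions in the PSR.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def compare_flags(op1: int, op2: int, bits: int = 32) -> dict:
    """Flags after comparing first operand op1 with second operand op2."""
    mask = (1 << bits) - 1
    u1, u2 = op1 & mask, op2 & mask
    return {
        "L": int(u2 < u1),                                    # unsigned: second < first
        "Z": int(u2 == u1),                                   # operands equal
        "N": int(to_signed(u2, bits) < to_signed(u1, bits)),  # signed: second < first
    }
```

Comparing a first operand of 1 with a second operand of 0xFFFFFFFF shows the difference between the two orderings: as unsigned integers the second operand is larger (L = 0), but as signed integers it is -1 and therefore smaller (N = 1).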

2.1.4 Configuration Register 

The Configuration Register (CFG) is 32 bits wide, of which 
ten bits are implemented. The implemented bits enable vari- 
ous operating modes for the CPU, including vectoring of 
interrupts, execution of slave instructions, and control of the 
on-chip caches. In the NS32332 bits 4 through 7 of the CFG 
register selected between the 16-bit and 32-bit slave proto- 
cols and between 512-byte and 4-Kbyte page sizes. The 
NS32GX32 supports only the 32-bit slave protocol and no 
memory management: consequently these bits are forced 
to 1. 

When the CFG register is loaded using the LPRi instruction, 
bit 2 and bits 13 through 31 should be set to 0. Bits 4 
through 7 are ignored during loading, and are always 
returned as 1s when CFG is stored via the SPRi instruction. 
When the SETCFG instruction is executed, the contents of 
the CFG register bits 0 through 3 are loaded from the 
instruction’s short field, bits 4 through 7 are ignored and bits 8 
through 12 are forced to 0. Bit 2 must be set to 0. 
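
The loading rules above can be modeled with the following Python sketch. The function names `load_cfg_lpr` and `setcfg` are hypothetical, and rejecting a value with reserved bits set is an assumption of this model: the text only states that those bits should be 0 and does not say how the CPU reacts otherwise.

```python
FORCED_ONES = 0x000000F0    # bits 4 through 7 always read back as 1
RESERVED_ZERO = 0xFFFFE004  # bit 2 and bits 13 through 31


def load_cfg_lpr(value: int) -> int:
    """Model the CFG contents read back after loading `value` with LPRi."""
    if value & RESERVED_ZERO:
        # Assumption: the model rejects values the text says should not be used.
        raise ValueError("bit 2 and bits 13-31 should be 0")
    return (value & 0xFFFFFFFF) | FORCED_ONES


def setcfg(short_field: int) -> int:
    """Model the CFG contents after SETCFG with a 4-bit short field."""
    if short_field & ~0xF:
        raise ValueError("short field is 4 bits wide")
    if short_field & 0x4:
        raise ValueError("bit 2 must be 0")
    # Bits 0-3 come from the short field, bits 4-7 stay 1, bits 8-12 become 0.
    return (short_field & 0xF) | FORCED_ONES
```

Note the difference between the two paths in this model: LPRi can set bits 8 through 12, while SETCFG always clears them.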

The format of the CFG register is shown in Figure 2-3. The 
various control bits are described below. 


I Interrupt vectoring. This bit controls whether maskable 
interrupts are handled in nonvectored (I = 0) or 
vectored (I = 1) mode. Refer to Section 3.2.3 for more 
information. 

F Floating-point instruction set. This bit indicates 
whether a floating-point unit (FPU) is present to execute 
floating-point instructions. If this bit is 0 when the 
CPU executes a floating-point instruction, a Trap 
(UND) occurs. If this bit is 1, then the CPU transfers 
the instruction and any necessary operands to the 
FPU using the slave-processor protocol described in 
Section 3.1.4.1. 

C Custom instruction set. This bit indicates whether a 
custom slave processor is present to execute custom 
instructions. If this bit is 0 when the CPU executes a 
custom instruction, a Trap (UND) occurs. If this bit is 
1, the CPU transfers the instruction and any necessary 
operands to the custom slave processor using 
the slave-processor protocol described in Section 
3.1.4.1. 

DE Direct-Exception mode enable. This bit enables the 
Direct-Exception mode for processing exceptions. 
When this mode is selected, the CPU response time 
to interrupts and other exceptions is significantly im- 
proved. Refer to Section 3.2.1 for more information. 

DC Data Cache enable. This bit enables the on-chip Data 
Cache to be accessed for data reads and writes. Re- 
fer to Section 3.4.2 for more information. 

LDC Lock Data Cache. This bit controls whether the con- 
tents of the on-chip Data Cache are locked to fixed 
memory locations (LDC= 1), or updated when a data 
read is missing from the cache (LDC=0). 

IC Instruction Cache enable. This bit enables the on-chip 
Instruction Cache to be accessed for instruction 
fetches. Refer to Section 3.4.1 for more information. 

LIC Lock Instruction Cache. This bit controls whether the 
contents of the on-chip Instruction Cache are locked 
to fixed memory locations (LIC= 1), or updated when 
an instruction fetch is missing from the cache 
(LIC = 0). 
FIGURE 2-3. Configuration Register (CFG). Bits 13 to 31 are Reserved; Bits 4 to 7 are Forced to 1. 

2.1.5 Debug Registers 

The NS32GX32 contains 4 registers dedicated for debug- 
ging functions. 

These registers are accessed using privileged forms of the 
LPRi and SPRi instructions. 

DCR — Debug Condition Register. The DCR Register en- 
ables detection of debug conditions. The format of the DCR 
is shown in Figure 2-4; the various bits are described below. 
A debug condition is enabled when the related bit is set to 1. 

CBE0 Compare Byte Enable 0; when set, BYTE0 of an 
aligned double-word is included in the address comparison 

CBE1 Compare Byte Enable 1; when set, BYTE1 of an 
aligned double-word is included in the address com- 
parison 

CBE2 Compare Byte Enable 2; when set, BYTE2 of an 
aligned double-word is included in the address com- 
parison 

CBE3 Compare Byte Enable 3; when set, BYTE3 of an 
aligned double-word is included in the address com- 
parison 

CWR Address-compare enable for write references 
CRD Address-compare enable for read references 
CAE Address-compare enable 

TR Enable Trap (DBG) when a debug condition is de- 
tected 

PCE PC-match enable 
UD Enable debug conditions in User-Mode 
SD Enable debug conditions in Supervisor Mode 
DEN Enable debug conditions 


The following 2 bits control testing features that can be 
used during initial system debugging. These features are 
unique to the NS32GX32 implementation of the Series 
32000 architecture; as such, they may not be supported in 
future implementations. For normal operation these 2 bits 
should be set to 0. 

SI Single-Instruction mode enable. This bit, when set 
to 1, inhibits the overlapping of instruction execution. 

BCP Branch Condition Prediction disable. When this bit is 
1, the branch prediction mechanism is disabled. See 
Section 3.1.3.1. 

DSR — Debug Status Register. The DSR Register indicates 
debug conditions that have been detected. When the CPU 
detects an enabled debug condition, it sets the corresponding 
bit (BPC, BEX, BCA) in the DSR to 1. When an address-compare 
condition is detected, the RD bit is loaded to 
indicate whether a read or write reference was performed. 
Software must clear all the bits in the DSR when appropriate. 
The format of the DSR is shown in Figure 2-5; the various 
fields are described below. 

RD Indicates whether the last address-compare condi- 
tion was for a read (RD = 1) or write (RD = 0) 
reference 

BPC PC-match condition detected 
BEX External condition detected 
BCA Address-compare condition detected 

Note: If an address compare is detected for a read and write for the same 
instruction, the RD bit will remain clear. 

CAR— Compare Address Register. The CAR Register 
contains the address that is compared to operand reference 
addresses to detect an address-compare condition. The ad- 
dress must be double-word aligned; that is, the two least- 
significant bits must be 0. The CAR is 32 bits wide. 
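
One plausible reading of how the CAR and the CBE0 to CBE3 bits combine is sketched below in Python. This is an interpretation and not a statement of the actual hardware logic: it assumes that a reference hits when it touches at least one byte of the aligned double-word held in CAR whose Compare Byte Enable bit is set. The function name `address_compare_hit` and its parameters are hypothetical, and the CWR, CRD, CAE, UD, SD and DEN enables are left out.

```python
def address_compare_hit(car: int, cbe: list, address: int, size: int) -> bool:
    """Return True if a reference of `size` bytes at `address` matches CAR.

    car  -- double-word-aligned compare address
    cbe  -- four booleans, cbe[n] models the CBEn bit of the DCR
    """
    if car & 0x3:
        raise ValueError("CAR must be double-word aligned")
    for byte_addr in range(address, address + size):
        in_doubleword = (byte_addr & ~0x3) == car
        if in_doubleword and cbe[byte_addr & 0x3]:
            return True
    return False
```

With only CBE0 set and CAR = 0x1000, a byte reference to 0x1000 hits in this model, while a byte reference to 0x1001 does not.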
FIGURE 2-4. Debug Condition Register (DCR) 

FIGURE 2-5. Debug Status Register (DSR) 

BPC — Breakpoint Program Counter. The BPC Register 
contains the address that is compared with the PC contents 
to detect a PC-match condition. The BPC Register is 32 bits 
wide. 

2.2 MEMORY ORGANIZATION 

The NS32GX32 implements full 32-bit addresses. This al- 
lows the CPU to access up to 4 Gbytes of memory. The 
memory is a uniform linear address space. Memory locations 
are numbered sequentially starting at zero and ending 
at 2^32 - 1. The number specifying a memory location is 
called an address. The content of each memory location is 
a byte consisting of eight bits. Unless otherwise noted, diagrams 
in this document show data stored in memory with 
the lowest address on the right and the highest address on 
the left. Also, when data is shown vertically, the lowest ad- 
dress is at the top of a diagram and the highest address at 
the bottom of the diagram. When bits are numbered in a 
diagram, the least significant bit is given the number zero, 
and is shown at the right of the diagram. Bits are numbered 
in increasing significance and toward the left. 

Byte at Address A (bits numbered 7 to 0, left to right) 

Two contiguous bytes are called a word. Except where not- 
ed, the least significant byte of a word is stored at the lower 
address, and the most significant byte of the word is stored 
at the next higher address. In memory, the address of a 
word is the address of its least significant byte, and a word 
may start at any address. 
Word at Address A 

Two contiguous words are called a double-word. Except 
where noted, the least significant word of a double-word is 
stored at the lowest address and the most significant word 
of the double-word is stored at the address two higher. In 
memory, the address of a double-word is the address of its 
least significant byte, and a double-word may start at any 
address. 
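
The byte ordering of words and double-words described above (least significant byte at the lowest address, any starting address allowed) can be illustrated with a short Python sketch. The helper names `store` and `load` are hypothetical.

```python
def store(memory: bytearray, address: int, value: int, size: int) -> None:
    """Store a 1-, 2- or 4-byte value, least significant byte first."""
    memory[address:address + size] = value.to_bytes(size, "little")


def load(memory: bytearray, address: int, size: int) -> int:
    """Load a 1-, 2- or 4-byte value stored least significant byte first."""
    return int.from_bytes(memory[address:address + size], "little")
```

Storing the double-word 0x12345678 at address 1 places 0x78 at address 1 and 0x12 at address 4; the word at address 1 is then 0x5678 and the word at address 3, two higher, is 0x1234.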
Double-Word at Address A 

Although memory is addressed as bytes, it is actually orga- 
nized as double-words. Note that access time to a word or a 
double-word depends upon its address, e.g. double-words 
that are aligned to start at addresses that are multiples of 
four will be accessed more quickly than those not so 
aligned. This also applies to words that cross a double-word 
boundary. 
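
As a rough illustration of the alignment effect, the following Python sketch counts how many aligned double-words an operand spans. The function name `doublewords_touched` is hypothetical, and the count is only a proxy for access cost: the actual number of bus cycles also depends on the external bus width and on the caches, which this sketch does not model.

```python
def doublewords_touched(address: int, size: int) -> int:
    """Number of aligned double-words spanned by `size` bytes at `address`."""
    first = address // 4
    last = (address + size - 1) // 4
    return last - first + 1
```

A double-word at an address that is a multiple of four spans one aligned double-word, while the same operand at address 2 spans two. A word at address 3 also spans two, because it crosses a double-word boundary.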

2.2.1 Address Mapping 

Figure 2-6 shows the NS32GX32 address mapping. 

The NS32GX32 supports the use of memory-mapped pe- 
ripheral devices and coprocessors. Such memory-mapped 
devices can be located at arbitrary locations in the address 
space except for the upper 8 Mbytes of memory (addresses 
between FF800000 (hex) and FFFFFFFF (hex), inclusive), 
which are reserved by National Semiconductor Corporation. 
Nevertheless, it is recommended that high-performance 
peripheral devices and coprocessors be located in a specific 
8-Mbyte region of memory (addresses between FF000000 
(hex) and FF7FFFFF (hex), inclusive) that is dedicated for 
memory-mapped I/O. This is because the NS32GX32 
detects references to the dedicated locations and serializes 
reads and writes. See Section 3.1.3.3. When making I/O 
references to addresses outside the dedicated region, 
external hardware must indicate to the NS32GX32 that special 
handling is required. In this case a small performance 
degradation will also result. Refer to Section 3.1.3.2 for more 
information on memory-mapped I/O. 
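
The address ranges given above can be expressed as a simple classification routine. This Python sketch uses only the two 8-Mbyte boundaries stated in the text; the function name `classify_address` and the returned labels are hypothetical.

```python
def classify_address(address: int) -> str:
    """Classify a 32-bit address using the ranges given in Section 2.2.1."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if address >= 0xFF800000:
        return "reserved"           # upper 8 Mbytes, reserved by NSC
    if address >= 0xFF000000:
        return "memory-mapped I/O"  # dedicated region, accesses serialized
    return "memory and I/O"         # general address space
```

A peripheral placed at FF000000 (hex) falls in the dedicated region, so no external indication of special handling is needed for it according to the text.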



[Figure not reproduced. Regions shown: Memory and I/O; Memory-Mapped I/O; Reserved by NSC; Interrupt Control.] 

FIGURE 2-6. NS32GX32 Address Mapping 

2.3 MODULAR SOFTWARE SUPPORT 

The NS32GX32 provides special support for software mod- 
ules and modular programs. 

Each module in a NS32GX32 software environment con- 
sists of three components: 

1. Program Code Segment. 

This segment contains the module’s code and constant 
data. 

2. Static Data Segment. 

Used to store variables and data that may be accessed 
by all procedures within the module. 

3. Link Table. 

This component contains two types of entries: Absolute 
Addresses and Procedure Descriptors. 

An Absolute Address is used in the external addressing 
mode, in conjunction with a displacement and the current 
MOD Register contents to compute the effective address 
of an external variable belonging to another module. 

The Procedure Descriptor is used in the call external pro- 
cedure (CXP) instruction to compute the address of an 
external procedure. 

Normally, the linker program specifies the locations of the 
three components. The Static Data and Link Table typically 
reside in RAM; the code component can be either in RAM or 
in ROM. The three components can be mapped into non- 
contiguous locations in memory, and each can be indepen- 
dently relocated. Since the Link Table contains the absolute 
addresses of external variables, the linker need not assign 
absolute memory addresses for these in the module itself; 
they may be assigned at load time. 

To handle the transfer of control from one module to anoth- 
er, the NS32GX32 uses a module table in memory and two 
registers in the CPU. 


The Module Table is located within the first 64 kbytes of 
memory. This table contains a Module Descriptor (also 
called a Module Table Entry) for each module in the ad- 
dress space of the program. A Module Descriptor has four 
32-bit entries corresponding to each component of a mod- 
ule: 

• The Static Base entry contains the address of the begin- 
ning of the module’s static data segment. 

• The Link Table Base points to the beginning of the mod- 
ule’s Link Table. 

• The Program Base is the address of the beginning of the 
code and constant data for the module. 

• A fourth entry is currently unused but reserved. 

The MOD Register in the CPU contains the address of the 
Module Descriptor for the currently executing module. 

The Static Base Register (SB) contains a copy of the Static 
Base entry in the Module Descriptor of the currently execut- 
ing module, i.e., it points to the beginning of the current 
module’s static data area. 

This register is implemented in the CPU for efficiency
purposes. By having a copy of the static base entry on chip,
the CPU can avoid reading it from memory each time a data
item in the static data segment is accessed. 

In an NS32GX32 software environment modules need not 
be linked together prior to loading. As modules are loaded, 
a linking loader simply updates the Module Table and fills 
the Link Table entries with the appropriate values. No modi- 
fication of a module’s code is required. Thus, modules may 
be stored in read-only memory and may be added to a sys- 
tem independently of each other, without regard to their in- 
dividual addressing. Figure 2-7 shows a typical NS32GX32 
run-time environment. 
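
The relationship between the MOD Register, the Module Descriptor and the Link Table can be sketched as a small model. Several things here are assumptions made for illustration: memory is modeled as a dictionary of 32-bit values keyed by address, the four descriptor entries are assumed to sit at consecutive 4-byte offsets in the order listed above, and Link Table entries are assumed to be 4 bytes each. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

MASK32 = 0xFFFFFFFF

@dataclass
class ModuleDescriptor:
    static_base: int   # start of the module's static data segment
    link_base: int     # start of the module's Link Table
    program_base: int  # start of the module's code and constant data
    reserved: int = 0  # fourth entry, unused but reserved

def read_descriptor(mem: Dict[int, int], mod: int) -> ModuleDescriptor:
    """Read the descriptor that the MOD Register value `mod` points to.
    Assumed layout: four consecutive 32-bit entries."""
    return ModuleDescriptor(mem[mod], mem[mod + 4], mem[mod + 8], mem[mod + 12])

def external_address(mem: Dict[int, int], mod: int, entry: int, disp: int) -> int:
    """Effective address of an external variable: the Absolute Address held
    in Link Table entry number `entry`, plus a displacement."""
    desc = read_descriptor(mem, mod)
    pointer = mem[desc.link_base + 4 * entry]  # assumed 4-byte entries
    return (pointer + disp) & MASK32
```

Because only the Link Table holds the absolute address, a loader can relocate the external variable by rewriting one table entry, leaving the module's code untouched.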


[Figure 2-7: diagram of the run-time environment, including the
Module Table Entry.]

Note: Dashed lines indicate information copied to registers during transfer of control between modules. 

FIGURE 2-7. NS32GX32 Run-Time Environment 


[Figure 2-8: instruction layout showing the Optional Extensions
and the Basic Instruction.]

FIGURE 2-8. General Instruction Format 

2.4 INSTRUCTION SET 


FIGURE 2-9. Index Byte Format 

Byte Displacement: Range -64 to +63 


2.4.1 General Instruction Format 

Figure 2-8 shows the general format of a Series 32000 in- 
struction. The Basic Instruction is one to three bytes long 
and contains the Opcode and up to two 5-bit General Ad- 
dressing Mode (“Gen”) fields. Following the Basic Instruc- 
tion field is a set of optional extensions, which may appear 
depending on the instruction and the addressing modes se- 
lected. 

Index Bytes appear when either or both Gen fields specify 
Scaled Index. In this case, the Gen field specifies only the 
Scale Factor (1, 2, 4 or 8), and the Index Byte specifies 
which General Purpose Register to use as the index, and 
which addressing mode calculation to perform before index- 
ing. See Figure 2-9. 

Following Index Bytes come any displacements (addressing
constants) or immediate values associated with the selected
addressing modes. Each Disp/Imm field may contain one or two
displacements, or one immediate value. The size of a
Displacement field is encoded with the top bits of that
field, as shown in Figure 2-10, with the remaining bits
interpreted as a signed (two’s complement) value. The size of
an immediate value is determined from the Opcode field. Both
Displacement and Immediate fields are stored most significant
byte first. Note that this is different from the memory
representation of data (Section 2.2). 
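
A displacement decoder makes the encoding concrete. This is a sketch under stated assumptions: the top-bit patterns used below (0 for a byte, 10 for a word, 11 for a double word) are not spelled out in the text but are consistent with the ranges given with Figure 2-10; the function name is hypothetical.

```python
from typing import Tuple

def decode_displacement(data: bytes) -> Tuple[int, int]:
    """Decode the displacement at the start of `data`.
    Returns (signed value, number of bytes consumed)."""
    first = data[0]
    if first & 0x80 == 0:        # assumed pattern 0xxxxxxx: 7-bit value
        length, bits = 1, 7
    elif first & 0xC0 == 0x80:   # assumed pattern 10xxxxxx: 14-bit value
        length, bits = 2, 14
    else:                        # assumed pattern 11xxxxxx: 30-bit value
        if first == 0xE0:
            raise ValueError("most significant byte 11100000 is reserved")
        length, bits = 4, 30
    if len(data) < length:
        raise ValueError("truncated displacement")
    # Fields are stored most significant byte first.
    raw = int.from_bytes(data[:length], "big") & ((1 << bits) - 1)
    if raw & (1 << (bits - 1)):  # two's complement sign bit
        raw -= 1 << bits
    return raw, length
```

With the reserved pattern excluded, the most negative double-word displacement this decoder accepts is -(2^29 - 2^24), matching the note under Figure 2-10.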

Some instructions require additional, “implied” immediates
and/or displacements, apart from those associated with
addressing modes. Any such extensions appear at the end of
the instruction, in the order that they appear within the list of
operands in the instruction definition (Section 2.4.3). 

2.4.2 Addressing Modes 

The CPU generally accesses an operand by calculating its 
Effective Address based on information available when the 
operand is to be accessed. The method to be used in per- 
forming this calculation is specified by the programmer as 
an “addressing mode.” 


[Figure 2-10: signed displacement field layouts.]

Word Displacement: Range -8192 to +8191 

Double Word Displacement: 
Range -(2^29 - 2^24) to +(2^29 - 1)* 

FIGURE 2-10. Displacement Encodings 

*Note: The pattern “11100000” for the most significant byte of the
displacement is reserved by National for future enhancements.
Therefore, it should never be used by the user program. This
causes the lower limit of the displacement range to be
-(2^29 - 2^24) instead of -2^29. 

Addressing modes are designed to optimally support high- 
level language accesses to variables. In nearly all cases, a 
variable access requires only one addressing mode, within 
the instruction that acts upon that variable. Extraneous data 
movement is therefore minimized. 

Addressing Modes fall into nine basic types: 

Register: The operand is available in one of the eight Gen- 
eral Purpose Registers. In certain Slave Processor instruc- 
tions, an auxiliary set of eight registers may be referenced 
instead. 

Register Relative: A General Purpose Register contains an 
address to which is added a displacement value from the 
instruction, yielding the Effective Address of the operand in 
memory. 

Memory Space: Identical to Register Relative above, ex- 
cept that the register used is one of the dedicated registers 
PC, SP, SB or FP. These registers point to data areas gen- 
erally needed by high-level languages. 

Memory Relative: A pointer variable is found within the 
memory space pointed to by the SP, SB or FP register. A 
displacement is added to that pointer to generate the Effec- 
tive Address of the operand. 

Immediate: The operand is encoded within the instruction. 
This addressing mode is not allowed if the operand is to be 
written. 

Absolute: The address of the operand is specified by a 
displacement field in the instruction. 

External: A pointer value is read from a specified entry of
the current Link Table. To this pointer value is added a
displacement, yielding the Effective Address of the operand. 

Top of Stack: The currently-selected Stack Pointer (SP0 or
SP1) specifies the location of the operand. The operand is
pushed or popped, depending on whether it is written or
read. 

Scaled Index: Although encoded as an addressing mode,
Scaled Indexing is an option on any addressing mode except
Immediate or another Scaled Index. It has the effect of
calculating an Effective Address, then multiplying any
General Purpose Register by 1, 2, 4 or 8 and adding it into the
total, yielding the final Effective Address of the operand. 

Table 2-2 is a brief summary of the addressing modes. For a
complete description of their actions, see the Instruction Set
Reference Manual. 
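
The effective-address calculations described above can be illustrated with a short model. This is a sketch, not a description of the hardware: registers are a list of eight integers, memory is a dictionary of 32-bit pointers keyed by address, results wrap to 32 bits, and all function names are hypothetical.

```python
from typing import Dict, List

MASK32 = 0xFFFFFFFF
SCALE = {"B": 1, "W": 2, "D": 4, "Q": 8}  # byte, word, double word, quad word

def register_relative(regs: List[int], n: int, disp: int) -> int:
    """Register Relative: displacement added to the address in Rn."""
    return (regs[n] + disp) & MASK32

def memory_relative(mem: Dict[int, int], base: int, disp1: int, disp2: int) -> int:
    """Memory Relative: a pointer is read at base + disp1 (base being the
    SP, SB or FP value), then disp2 is added to that pointer."""
    pointer = mem[(base + disp1) & MASK32]
    return (pointer + disp2) & MASK32

def scaled_index(base_ea: int, regs: List[int], n: int, size: str) -> int:
    """Scaled Index: Rn multiplied by 1, 2, 4 or 8 and added to the
    Effective Address produced by the underlying mode."""
    return (base_ea + SCALE[size] * regs[n]) & MASK32
```

For example, indexing an array of double words at 8(R1) with R2 as the index corresponds to `scaled_index(register_relative(regs, 1, 8), regs, 2, "D")`.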

2.4.3 Instruction Set Summary 

Table 2-3 presents a brief description of the NS32GX32 in- 
struction set. The Format column refers to the Instruction 
Format tables (Appendix A). The Instruction column gives 
the instruction as coded in assembly language, and the De- 
scription column provides a short description of the function 
provided by that instruction. Further details of the exact op- 
erations performed by each instruction may be found in the 
Instruction Set Reference Manual. 

Notations: 

i = Integer length suffix: B = Byte 
                           W = Word 
                           D = Double Word 

f = Floating Point length suffix: F = Standard Floating 
                                  L = Long Floating 

gen = General operand. Any addressing mode can be 
specified. 

short = A 4-bit value encoded within the Basic Instruction 
(see Appendix A for encodings). 

imm = Implied immediate operand. An 8-bit value appended
after any addressing extensions. 

disp = Displacement (addressing constant): 8, 16 or 32 
bits. All three lengths legal. 

reg = Any General Purpose Register: R0-R7. 

areg = Any Processor Register: Address, Debug, Status, 
Configuration. 

creg = A Custom Slave Processor Register (Implementation
Dependent). 

cond = Any condition code, encoded as a 4-bit field within 
the Basic Instruction (see Appendix A for encodings). 



TABLE 2-2. NS32GX32 Addressing Modes 

ENCODING  MODE                     ASSEMBLER SYNTAX    EFFECTIVE ADDRESS

Register
00000     Register 0               R0, F0, L0          None: Operand is in the
00001     Register 1               R1, F1, L1          specified register.
00010     Register 2               R2, F2, L2
00011     Register 3               R3, F3, L3
00100     Register 4               R4, F4, L4
00101     Register 5               R5, F5, L5
00110     Register 6               R6, F6, L6
00111     Register 7               R7, F7, L7

Register Relative
01000     Register 0 relative      disp(R0)            Disp + Register.
01001     Register 1 relative      disp(R1)
01010     Register 2 relative      disp(R2)
01011     Register 3 relative      disp(R3)
01100     Register 4 relative      disp(R4)
01101     Register 5 relative      disp(R5)
01110     Register 6 relative      disp(R6)
01111     Register 7 relative      disp(R7)

Memory Relative
10000     Frame memory relative    disp2(disp1(FP))    Disp2 + Pointer; Pointer found at
10001     Stack memory relative    disp2(disp1(SP))    address Disp1 + Register. “SP” is either
10010     Static memory relative   disp2(disp1(SB))    SP0 or SP1, as selected in PSR.

Reserved
10011     (Reserved for Future Use)

Immediate
10100     Immediate                value               None. Operand is input from
                                                       instruction queue.

Absolute
10101     Absolute                 @disp               Disp.

External
10110     External                 EXT(disp1) + disp2  Disp2 + Pointer; Pointer is found
                                                       at Link Table Entry number Disp1.

Top of Stack
10111     Top of stack             TOS                 Top of current stack, using either
                                                       User or Interrupt Stack Pointer,
                                                       as selected in PSR. Automatic
                                                       Push/Pop included.

Memory Space
11000     Frame memory             disp(FP)            Disp + Register; “SP” is either
11001     Stack memory             disp(SP)            SP0 or SP1, as selected in PSR.
11010     Static memory            disp(SB)
11011     Program memory           * + disp

Scaled Index
11100     Index, bytes             mode[Rn:B]          EA (mode) + Rn.
11101     Index, words             mode[Rn:W]          EA (mode) + 2 x Rn.
11110     Index, double words      mode[Rn:D]          EA (mode) + 4 x Rn.
11111     Index, quad words        mode[Rn:Q]          EA (mode) + 8 x Rn.
                                                       “Mode” and “n” are contained
                                                       within the Index Byte.
                                                       EA (mode) denotes the effective
                                                       address generated using mode.



TABLE 2-3. NS32GX32 Instruction Set Summary 

MOVES 

Format  Operation  Operands       Description
4       MOVi       gen,gen        Move a value.
2       MOVQi      short,gen      Extend and move a signed 4-bit constant.
7       MOVMi      gen,gen,disp   Move Multiple: disp bytes (1 to 16).
7       MOVZBW     gen,gen        Move with zero extension.
7       MOVZiD     gen,gen        Move with zero extension.
7       MOVXBW     gen,gen        Move with sign extension.
7       MOVXiD     gen,gen        Move with sign extension.
4       ADDR       gen,gen        Move Effective Address.

INTEGER ARITHMETIC 

Format  Operation  Operands       Description
4       ADDi       gen,gen        Add.
2       ADDQi      short,gen      Add signed 4-bit constant.
4       ADDCi      gen,gen        Add with carry.
4       SUBi       gen,gen        Subtract.
4       SUBCi      gen,gen        Subtract with carry (borrow).
6       NEGi       gen,gen        Negate (2’s complement).
6       ABSi       gen,gen        Take absolute value.
7       MULi       gen,gen        Multiply.
7       QUOi       gen,gen        Divide, rounding toward zero.
7       REMi       gen,gen        Remainder from QUO.
7       DIVi       gen,gen        Divide, rounding down.
7       MODi       gen,gen        Remainder from DIV (Modulus).
7       MEIi       gen,gen        Multiply to Extended Integer.
7       DEIi       gen,gen        Divide Extended Integer.

PACKED DECIMAL (BCD) ARITHMETIC 

Format  Operation  Operands       Description
6       ADDPi      gen,gen        Add Packed.
6       SUBPi      gen,gen        Subtract Packed.

INTEGER COMPARISON 

Format  Operation  Operands       Description
4       CMPi       gen,gen        Compare.
2       CMPQi      short,gen      Compare to signed 4-bit constant.
7       CMPMi      gen,gen,disp   Compare Multiple: disp bytes (1 to 16).

LOGICAL AND BOOLEAN 

Format  Operation  Operands       Description
4       ANDi       gen,gen        Logical AND.
4       ORi        gen,gen        Logical OR.
4       BICi       gen,gen        Clear selected bits.
4       XORi       gen,gen        Logical Exclusive OR.
6       COMi       gen,gen        Complement all bits.
6       NOTi       gen,gen        Boolean complement: LSB only.
2       Scondi     gen            Save condition code (cond) as a Boolean variable of size i.

SHIFTS 

Format  Operation  Operands       Description
6       LSHi       gen,gen        Logical Shift, left or right.
6       ASHi       gen,gen        Arithmetic Shift, left or right.
6       ROTi       gen,gen        Rotate, left or right.
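
The two division pairs in the INTEGER ARITHMETIC group differ only in how a negative quotient is rounded. The sketch below illustrates that difference with ordinary Python integers; it models the arithmetic described in the table, not operand sizes, operand order or flags, and the function names simply mirror the mnemonics.

```python
def quo(dividend: int, divisor: int) -> int:
    """Divide, rounding toward zero (as described for QUOi)."""
    q = abs(dividend) // abs(divisor)
    return q if (dividend < 0) == (divisor < 0) else -q

def rem(dividend: int, divisor: int) -> int:
    """Remainder paired with quo (as described for REMi)."""
    return dividend - divisor * quo(dividend, divisor)

def div(dividend: int, divisor: int) -> int:
    """Divide, rounding down toward negative infinity (as described for DIVi)."""
    return dividend // divisor

def mod(dividend: int, divisor: int) -> int:
    """Remainder paired with div (as described for MODi)."""
    return dividend - divisor * div(dividend, divisor)
```

For -7 divided by 2, `quo` and `rem` give -3 and -1, while `div` and `mod` give -4 and 1. For operands of the same sign the two pairs agree.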





TABLE 2-3. NS32GX32 Instruction Set Summary (Continued) 

BITS 

Format  Operation  Operands          Description
4       TBITi      gen,gen           Test bit.
6       SBITi      gen,gen           Test and set bit.
6       SBITIi     gen,gen           Test and set bit, interlocked.
6       CBITi      gen,gen           Test and clear bit.
6       CBITIi     gen,gen           Test and clear bit, interlocked.
6       IBITi      gen,gen           Test and invert bit.
8       FFSi       gen,gen           Find first set bit.

BIT FIELDS 

Bit fields are values in memory that are not aligned to byte boundaries. Examples are PACKED arrays and records used in 
Pascal. “Extract” instructions read and align a bit field. “Insert” instructions write a bit field from an aligned source. 

Format  Operation  Operands          Description
8       EXTi       reg,gen,gen,disp  Extract bit field (array oriented).
8       INSi       reg,gen,gen,disp  Insert bit field (array oriented).
7       EXTSi      gen,gen,imm,imm   Extract bit field (short form).
7       INSSi      gen,gen,imm,imm   Insert bit field (short form).
8       CVTP       reg,gen,gen       Convert to Bit Field Pointer.

ARRAYS 

Format  Operation  Operands          Description
8       CHECKi     reg,gen,gen       Index bounds check.
8       INDEXi     reg,gen,gen       Recursive indexing step for multiple-dimensional arrays.

STRINGS 

String instructions assign specific functions to the General Purpose Registers: 

R4 - Comparison Value 
R3 - Translation Table Pointer 
R2 - String 2 Pointer 
R1 - String 1 Pointer 
R0 - Limit Count 

Options on all string instructions are: 

B (Backward): Decrement string pointers after each step rather than incrementing. 
U (Until match): End instruction if String 1 entry matches R4. 
W (While match): End instruction if String 1 entry does not match R4. 

All string instructions end when R0 decrements to zero. 

Format  Operation  Operands          Description
5       MOVSi      options           Move String 1 to String 2.
        MOVST      options           Move string, translating bytes.
5       CMPSi      options           Compare String 1 to String 2.
        CMPST      options           Compare translating, String 1 bytes.
5       SKPSi      options           Skip over String 1 entries.
        SKPST      options           Skip, translating bytes for Until/While.
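
The register roles and the U/W options can be illustrated with a simplified model of a skip operation. This is a sketch under assumptions, not a description of SKPSi itself: it works on a Python sequence instead of memory, ignores the B option, translation and flags, and its name and return value are invented for the example.

```python
from typing import Optional, Sequence, Tuple

def skip_string(string1: Sequence[int], limit_count: int, compare_value: int,
                option: Optional[str] = None) -> Tuple[int, int]:
    """Skip over String 1 entries.
    limit_count plays the role of R0 and compare_value the role of R4.
    Returns (number of entries skipped, remaining count)."""
    if option not in (None, "U", "W"):
        raise ValueError("option must be None, 'U' or 'W'")
    index = 0
    remaining = limit_count
    while remaining > 0:
        entry = string1[index]
        if option == "U" and entry == compare_value:
            break  # Until match: stop at the first entry equal to R4
        if option == "W" and entry != compare_value:
            break  # While match: stop at the first entry not equal to R4
        index += 1
        remaining -= 1
    return index, remaining
```

With no option the loop runs until the count reaches zero, which mirrors the statement that all string instructions end when R0 decrements to zero.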

TABLE 2-3. NS32GX32 Instruction Set Summary (Continued) 

JUMPS AND LINKAGE 

Format  Operation  Operands         Description
3       JUMP       gen              Jump.
0       BR         disp             Branch (PC Relative).
0       Bcond      disp             Conditional branch.
3       CASEi      gen              Multiway branch.
2       ACBi       short,gen,disp   Add 4-bit constant and branch if non-zero.
3       JSR        gen              Jump to subroutine.
1       BSR        disp             Branch to subroutine.
1       CXP        disp             Call external procedure.
3       CXPD       gen              Call external procedure using descriptor.
1       SVC                         Supervisor Call.
1       FLAG                        Flag Trap.
1       BPT                         Breakpoint Trap.
1       ENTER      [reg list],disp  Save registers and allocate stack frame (Enter Procedure).
1       EXIT       [reg list]       Restore registers and reclaim stack frame (Exit Procedure).
1       RET        disp             Return from subroutine.
1       RXP        disp             Return from external procedure call.
1       RETT       disp             Return from trap. (Privileged)
1       RETI                        Return from interrupt. (Privileged)

CPU REGISTER MANIPULATION 

Format  Operation  Operands         Description
1       SAVE       [reg list]       Save General Purpose Registers.
1       RESTORE    [reg list]       Restore General Purpose Registers.
2       LPRi       areg,gen         Load Processor Register. (Privileged if PSR, INTBASE, USP, CFG
                                    or Debug Registers).
2       SPRi       areg,gen         Store Processor Register. (Privileged if PSR, INTBASE, USP, CFG
                                    or Debug Registers).
3       ADJSPi     gen              Adjust Stack Pointer.
3       BISPSRi    gen              Set selected bits in PSR. (Privileged if not Byte length)
3       BICPSRi    gen              Clear selected bits in PSR. (Privileged if not Byte length)
5       SETCFG     [option list]    Set Configuration Register. (Privileged)

FLOATING POINT 

Format  Operation  Operands         Description
11      MOVf       gen,gen          Move a Floating Point value.
9       MOVLF      gen,gen          Move and shorten a Long value to Standard.
9       MOVFL      gen,gen          Move and lengthen a Standard value to Long.
9       MOVif      gen,gen          Convert any integer to Standard or Long Floating.
9       ROUNDfi    gen,gen          Convert to integer by rounding.
9       TRUNCfi    gen,gen          Convert to integer by truncating, toward zero.
9       FLOORfi    gen,gen          Convert to largest integer less than or equal to value.
11      ADDf       gen,gen          Add.
11      SUBf       gen,gen          Subtract.
11      MULf       gen,gen          Multiply.
11      DIVf       gen,gen          Divide.
11      CMPf       gen,gen          Compare.
11      NEGf       gen,gen          Negate.
11      ABSf       gen,gen          Take absolute value.
12      POLYf      gen,gen          Polynomial Step.
12      DOTf       gen,gen          Dot Product.
12      SCALBf     gen,gen          Binary Scale.
12      LOGBf      gen,gen          Binary Log.
9       LFSR       gen              Load FSR.
9       SFSR       gen              Store FSR.

TABLE 2-3. NS32GX32 Instruction Set Summary (Continued) 

MISCELLANEOUS 

Format  Operation  Operands      Description
1       NOP                      No Operation.
1       WAIT                     Wait for interrupt.
1       DIA                      Diagnose. Single-byte “Branch to Self” for hardware
                                 breakpointing. Not for use in programming.
14      CINV       options,gen   Cache Invalidate. (Privileged)
8       MOVSUi     gen,gen       Move a value from Supervisor
                                 Space to User Space. (Privileged)
8       MOVUSi     gen,gen       Move a value from User Space
                                 to Supervisor Space. (Privileged)

CUSTOM SLAVE 

Format  Operation  Operands      Description
15.5    CCAL0c     gen,gen       Custom Calculate.
15.5    CCAL1c     gen,gen
15.5    CCAL2c     gen,gen
15.5    CCAL3c     gen,gen
15.5    CMOV0c     gen,gen       Custom Move.
15.5    CMOV1c     gen,gen
15.5    CMOV2c     gen,gen
15.5    CMOV3c     gen,gen
15.5    CCMP0c     gen,gen       Custom Compare.
15.5    CCMP1c     gen,gen
15.1    CCV0ci     gen,gen       Custom Convert.
15.1    CCV1ci     gen,gen
15.1    CCV2ci     gen,gen
15.1    CCV3ic     gen,gen
15.1    CCV4DQ     gen,gen
15.1    CCV5QD     gen,gen
15.1    LCSR       gen           Load Custom Status Register.
15.1    SCSR       gen           Store Custom Status Register.
15.0    LCR        creg,gen      Load Custom Register. (Privileged)
15.0    SCR        creg,gen      Store Custom Register. (Privileged)


3.0 Functional Description 

This chapter provides details on the functional
characteristics of the NS32GX32 microprocessor. 

The chapter is divided into five main sections: 
Instruction Execution, Exception Processing, Debugging, 
On-Chip Caches and System Interface. 

3.1 INSTRUCTION EXECUTION 

To execute an instruction, the NS32GX32 performs the fol- 
lowing operations: 

• Fetch the instruction 

• Read source operands, if any (1) 

• Calculate results 

• Write result operands, if any 

• Modify flags, if necessary 

• Update the program counter 

Under most circumstances, the CPU can be conceived to 
execute instructions by completing the operations above in 
strict sequence for one instruction and then beginning the 
sequence of operations for the next instruction. However, 
due to the internal instruction pipelining, as well as the oc- 
currence of exceptions, the sequence of operations per- 
formed during the execution of an instruction may be al- 
tered. Furthermore, exceptions also break the sequentiality 
of the instructions executed by the CPU. 

Details on the effects of the internal pipelining, as well as 
the occurrence of exceptions on the instruction execution, 
are provided in the following sections. 

Note 1: In this and following sections, memory locations read by the CPU to
calculate effective addresses for Memory-Relative and External
addressing modes are considered like source operands, even if the
effective address is being calculated for an operand with access
class of write. 

3.1.1 Operating States 

The CPU has five operating states regarding the execution
of instructions and the processing of exceptions: Reset,
Executing Instructions, Processing An Exception,
Waiting-For-An-Interrupt, and Halted. The various states and
transitions between them are shown in Figure 3-1. 

Whenever the RST signal is asserted, the CPU enters the
Reset state. The CPU remains in the Reset state until the
RST signal is driven inactive, at which time it enters the
Executing-Instructions state. In the Reset state the contents
of certain registers are initialized. Refer to Section 3.5.3 for
details. 

In the Executing-Instructions state, the CPU executes
instructions. It exits this state when an exception is
recognized or a WAIT instruction is encountered, at which
time it enters the Processing-An-Exception state or the
Waiting-For-An-Interrupt state, respectively. 

While in the Processing-An-Exception state, the CPU saves 
the PC, PSR and MOD register contents on the stack and 
reads the new PC and module linkage information to begin 
execution of the exception service procedure (see note). 
Following the completion of all data references required to 
process an exception, the CPU enters the Executing-In- 
structions state. 

In the Waiting-For-An-Interrupt state, the CPU is idle. A
special status identifying this state is presented on the system
interface (Section 3.5). When an interrupt or a debug
condition is detected, the CPU enters the
Processing-An-Exception state. 

The CPU enters the Halted state when a bus error is detect- 
ed while the CPU is processing an exception, thereby pre- 
venting the transfer of control to an appropriate exception 
service procedure. The CPU remains in the Halted state 
until reset occurs. A special status identifying this state is 
presented on the system interface. 

Note: When the Direct-Exception mode is enabled, the CPU does not save 
the MOD Register contents nor does it read the module linkage infor- 
mation for the exception service procedure. Refer to Section 3.2 for 
details. 
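
The operating states and transitions described in this section can be summarized as a small state machine. This is a sketch only: the state and event names are labels invented for the example, and the model covers just the transitions stated in the text.

```python
# (current state, event) -> next state, as described in Section 3.1.1.
TRANSITIONS = {
    ("reset", "rst_released"): "executing",
    ("executing", "exception_recognized"): "processing_exception",
    ("executing", "wait_instruction"): "waiting_for_interrupt",
    ("processing_exception", "references_complete"): "executing",
    ("processing_exception", "bus_error"): "halted",
    ("waiting_for_interrupt", "interrupt_or_debug"): "processing_exception",
}

def next_state(state: str, event: str) -> str:
    """Return the state that follows `state` when `event` occurs."""
    if event == "rst_asserted":
        return "reset"  # asserting RST forces the Reset state from anywhere
    # Events with no listed transition leave the state unchanged; in
    # particular, Halted is left only by a reset.
    return TRANSITIONS.get((state, event), state)
```

Tracing a bus error during exception processing, `next_state("processing_exception", "bus_error")` yields "halted", and only "rst_asserted" moves the model out of that state.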

3.1.2 Instruction Endings 

The NS32GX32 checks for exceptions at various points 
while executing instructions. Certain exceptions, like inter- 
rupts, are in most cases recognized between instructions. 
Other exceptions, like Divide-By-Zero Trap, are recognized 
during execution of an instruction. When an exception is 
recognized during execution of an instruction, the instruction 
ends in one of four possible ways: completed, suspended, 
terminated, or partially completed. Each type of exception 
causes a particular ending, as specified in Section 3.2. 

3.1.2.1 Completed Instructions 

When an exception is recognized after an instruction is 
completed, the CPU has performed all of the operations for 
that instruction and for all other instructions executed since 
the last exception occurred. Result operands have been 
written, flags have been modified, and the PC saved on the 
Interrupt Stack contains the address of the next instruction 
to execute. The exception service procedure can, at its con- 
clusion, execute the RETT instruction (or the RETI instruc- 
tion for vectored interrupts), and the CPU will begin execut- 
ing the instruction following the completed instruction. 

3.1.2.2 Suspended Instructions 

An instruction is suspended when one of several trap condi- 
tions or a restartable bus error is detected during execution 
of the instruction. A suspended instruction has not been 
completed, but all other instructions executed since the last 
exception occurred have been completed. Result operands 
and flags due to be affected by the instruction may have 
been modified, but only modifications that allow the instruc- 
tion to be executed again and completed can occur. For 
certain exceptions (Trap (UND), Trap (ILL), and bus errors) 
the CPU clears the P-flag in the PSR before saving the copy 
that is pushed on the Interrupt Stack. The PC saved on the 
Interrupt Stack contains the address of the suspended in- 
struction. 

For example, the RESTORE instruction pops up to 8 gener- 
al-purpose registers from the stack. If an invalid page table 
entry is detected on one of the references to the stack, then 
the instruction is suspended. The general-purpose registers 
due to be loaded by the instruction may have been modified, 
but the stack pointer still holds the same value that it did 
when the instruction began. 

To complete a suspended instruction, the exception service 
procedure takes either of two actions: 

1. The service procedure can simulate the suspended in- 
struction’s execution. After calculating and writing the in- 
struction’s results, the flags in the PSR copy saved on the 
Interrupt Stack should be modified, and the PC saved on 
the Interrupt Stack should be updated to point to the next 
instruction to execute. The service procedure can then 
execute the RETT instruction, and the CPU begins exe- 
cuting the instruction following the suspended instruction. 
This is the action taken when floating-point instructions 
are simulated by software in systems without a hardware 
floating-point unit. 

2. The suspended instruction can be executed again after 
the service procedure has eliminated the trap condition 
that caused the instruction to be suspended. The service 
procedure should execute the RETT instruction at its con- 
clusion; then the CPU begins executing the suspended 
instruction again. This is the action taken by a debugger 
when it encounters a BPT instruction that was temporarily 
placed in another instruction's location in order to set a 
breakpoint. 

Note 1: Although the NS32GX32 allows a suspended instruction to be exe- 
cuted again and completed, the CPU may have read a source oper- 
and for the instruction from a memory-mapped peripheral port be- 
fore the exception was recognized. In such a case, the characteris- 
tics of the peripheral device may prevent correct reexecution of the 
instruction. 

Note 2: It may be necessary for the exception service procedure to alter the 
P-flag in the PSR copy saved on the Interrupt Stack: If the exception 
service procedure simulates the suspended instruction and the P- 
flag was cleared by the CPU before saving the PSR copy, then the 
saved T-flag must be copied to the saved P-flag (like the floating- 
point instruction simulation described above). Or if the exception 
service procedure executes the suspended instruction again and 
the P-flag was not cleared by the CPU before saving the PSR copy, 
then the saved P-flag must be cleared (like the breakpoint trap de- 
scribed above). Otherwise, no alteration to the saved P-flag is nec- 
essary. 
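
The rule in Note 2 can be written as a small decision function. This is a sketch: the function name, the argument names and the "simulate"/"reexecute" labels are hypothetical, and the flags are modeled as plain 0/1 integers taken from the PSR copy saved on the Interrupt Stack.

```python
def adjusted_saved_p_flag(saved_p: int, saved_t: int,
                          p_cleared_by_cpu: bool, action: str) -> int:
    """Return the value the saved P-flag should have before RETT.

    action is "simulate" when the service procedure simulates the
    suspended instruction, or "reexecute" when the instruction will be
    executed again.
    """
    if action not in ("simulate", "reexecute"):
        raise ValueError("action must be 'simulate' or 'reexecute'")
    if action == "simulate" and p_cleared_by_cpu:
        return saved_t   # copy the saved T-flag to the saved P-flag
    if action == "reexecute" and not p_cleared_by_cpu:
        return 0         # clear the saved P-flag
    return saved_p       # otherwise no alteration is necessary
```

The two altering cases correspond to the floating-point simulation and breakpoint examples given in the note.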

3.1.2.3 Terminated Instructions 

An instruction being executed is terminated when reset or a
nonrestartable bus error occurs. Any result operands and
flags due to be affected by the instruction are undefined, as
are the contents of the PC. The result operands of other
instructions executed since the last serializing operation may
not have been written to memory. A terminated instruction
cannot be completed. 

3.1.2.4 Partially Completed Instructions 

When a restartable bus error, interrupt, or debug condition is 
recognized during execution of a string instruction, the in- 
struction is said to be partially completed. A partially com- 
pleted instruction has not completed, but all other instruc- 
tions executed since the last exception occurred have been 
completed. Result operands and flags due to be affected by 
the instruction may have been modified, but the values 
stored in the string pointers and other general-purpose reg- 
isters used during the instruction's execution allow the in- 
struction to be executed again and completed. 

The CPU clears the P-flag in the PSR before saving the 
copy that is pushed on the Interrupt Stack. The PC saved on 
the Interrupt Stack contains the address of the partially 
completed instruction. The exception service procedure 
can, at its conclusion, simply execute the RETT instruction 
(or the RETI instruction for vectored interrupts), and the 
CPU will resume executing the partially completed instruc- 
tion. 

3.1.3 Instruction Pipeline 

The NS32GX32 executes instructions in a heavily pipelined 
fashion. This allows a significant performance enhancement 
since the operations of several instructions are performed 
simultaneously rather than in a strictly sequential manner. 
The CPU provides a four-stage internal instruction pipeline. 
As shown in Figure 3-2, a write buffer that can hold up to 
two operands is also provided to allow write operations to 
be performed off-line. 



[Figure 3-2: pipeline diagram showing an 8-byte queue buffer, a
buffer holding 1 decoded instruction, Stage 3 (Calculate
Addresses, Read Source Operands), Stage 4 (Calculate Results,
Write Destination Operands), and a buffer holding 2 memory
results.]

FIGURE 3-2. NS32GX32 Internal Instruction Pipeline 

Due to the pipelining, operations like fetching one instruc- 
tion, reading the source operands of a second instruction, 
calculating the results of a third instruction and storing the 
results of a fourth instruction, can all occur in parallel. 



The order of memory references performed by the CPU may 
also differ from that related to a strictly sequential instruc- 
tion execution. In fact, when an instruction is being execut- 
ed, some of the source operands may be read from memory 
before the instruction is completely fetched. For example, 
the CPU may read the first source operand for an instruction 
before it has fetched a displacement used in calculating the 
address of the second source operand. The CPU, however, 
always completes fetching an instruction and reading its 
source operands before writing its results. When more than 
one source operand must be read from memory to execute 
an instruction, the operands may be read in any order. Simi- 
larly, when more than one result operand is written to mem- 
ory to execute an instruction, the operands may be written 
in any order. 

An instruction is fetched only after all previous instructions 
have been completely fetched. However, the CPU may be- 
gin fetching an instruction before all of the source operands 
have been read and results written for previous instructions. 
The source operands for an instruction are read only after 
all previous instructions have been fetched and their source 
operands read. A source operand for an instruction may be 
read before all results of previous instructions have been 
written, except when the source operand’s value depends 
on a result not yet written. The CPU compares the address 
and length of a source operand with those of any results not 
yet written, and delays reading the source operand until af- 
ter writing all results on which the source operand depends. 
Also, the CPU ensures that the interlocked read and write 
references to execute an SBITIi or CBITIi instruction occur 
after writing all results of previous instructions and before 
reading any source operands for subsequent instructions. 
The result operands for an instruction are written after all 
results of previous instructions have been written. 
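
The address-and-length comparison described above can be pictured as a byte-range overlap test. The following Python sketch is illustrative only: the function names are hypothetical, the actual comparison logic of the NS32GX32 is not specified in this section, and wrap-around at the top of the 32-bit address space is ignored.

```python
def ranges_overlap(addr_a, len_a, addr_b, len_b):
    """True if byte ranges [addr_a, addr_a+len_a) and
    [addr_b, addr_b+len_b) share at least one byte."""
    return addr_a < addr_b + len_b and addr_b < addr_a + len_a


def must_delay_read(read_addr, read_len, pending_writes):
    """pending_writes holds (address, length) pairs for results
    that have not yet been written to memory (assumed model)."""
    return any(
        ranges_overlap(read_addr, read_len, w_addr, w_len)
        for w_addr, w_len in pending_writes
    )
```

A read of four bytes at 1002 (hex) would be delayed while a four-byte result at 1000 (hex) is pending; a read at 1004 (hex) would not.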

The description above is summarized in Figure 3-3, which 
shows the precedence of memory references for two con- 
secutive instructions. 


[Figure: diagram of the memory references of INSTRUCTION N
and INSTRUCTION N+1, beginning with each INSTRUCTION FETCH.]

FIGURE 3-3. Memory References for Consecutive Instructions

(An arrow from one reference to another indicates that
the first reference always precedes the second.)

Another consequence of overlapping the operations for
several instructions is that the CPU may fetch an instruction
and read its source operands, even though the instruction is
not executed (e.g., due to the occurrence of an exception).
Special care is needed in the handling of memory-mapped
I/O devices. The CPU provides special mechanisms to
ensure that the references to these devices are always
performed in the order implied by the program. Refer to
Section 3.1.3.2 for details.

Note also that the CPU does not check for dependencies
between the fetching of an instruction and the writing of
previous instructions’ results. Therefore, special care is
required when executing self-modifying code.

3.1.3.1 Branch Prediction 

One problem inherent to all pipelined machines is what is
called “Pipeline Breakage”. It occurs whenever the
sequential flow of instructions is broken, either by the
execution of certain instructions or by the occurrence of
exceptions.

Pipeline breakage degrades performance, because a portion
of the pipeline must be flushed and new data must be
brought in.

The NS32GX32 provides a special mechanism, called 
branch prediction, that helps minimize this performance 
penalty. 

When a conditional branch instruction is decoded in the ear- 
ly stages of the pipeline, a prediction on the execution of the 
instruction is performed. 

More precisely, the prediction mechanism predicts back- 
ward branches as taken and forward branches as not taken, 
except for the branch instructions BLE and BNE that are 
always predicted as taken. 

Thus, the resulting probability of correct prediction is fairly 
high, especially for branch instructions placed at the end of 
loops. 
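
The static prediction rule stated above can be sketched in a few lines of Python. This is only a model of the rule as worded in the text; the function name and the convention that a negative displacement means a backward branch are assumptions made for illustration, and a zero displacement is arbitrarily treated as forward.

```python
# Conditional branches that the text says are always predicted taken.
ALWAYS_TAKEN = {"BLE", "BNE"}


def predict_taken(mnemonic, displacement):
    """Return True if the branch is predicted taken.

    displacement: signed branch displacement; negative values are
    assumed to denote a backward branch.
    """
    if mnemonic.upper() in ALWAYS_TAKEN:
        return True
    return displacement < 0  # backward: taken, forward: not taken
```

For a loop closed by a backward conditional branch, this rule mispredicts only on the final iteration, which is why the text singles out branches at the end of loops.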

The sequence of operations performed by the loader and 
execution units in the CPU is given below: 

• Loader detects branches and calculates destination ad- 
dresses 

• Loader uses branch opcode and direction to select be- 
tween sequential and non-sequential streams 

• Loader saves address for alternate stream 

• Execution unit resolves branch decision 

Due to the branch prediction, some special care is required
when writing self-modifying code. Refer to the appropriate
section in Appendix B for more information on this subject.

3.1.3.2 Memory-Mapped I/O 

The characteristics of certain peripheral devices and the 
overlapping of instruction execution in the pipeline of the 
NS32GX32 require that special handling be applied to mem- 
ory-mapped I/O references. I/O references differ from 
memory references in two significant ways, imposing the 
following requirements: 

1. Reading from a peripheral port can alter the value read 
on the next reference to the same port or another port in 
the same device. (A characteristic called here “destruc- 
tive-reading”.) Serial communication controllers and 
FIFO buffers commonly operate in this manner. As ex- 
plained in “Instruction Pipeline” above, the NS32GX32 
can read the source operands for one instruction while 
the previous instruction is executing. Because the previ- 
ous instruction may cause a trap, an interrupt may be 
recognized, or the flow of control may be otherwise al- 
tered, it is a requirement that destructive-reading of 
source operands before the execution of an instruction 
be avoided.

2. Writing to a peripheral port can alter the value read from 
another port of the same device. (A characteristic called 
here “side-effects of writing”). For example, before read- 
ing the counter's value from the NS32202 Interrupt Con- 
trol Unit it is first necessary to freeze the value by writing 
to another control register. 

However, as mentioned above, the NS32GX32 can read the
source operands for one instruction before writing the
results of previous instructions unless the addresses indicate
a dependency between the read and write references.
Consequently, it is a requirement that read and write
references to peripherals that exhibit side-effects of writing
must occur in the order dictated by the instructions.

The NS32GX32 supports 2 methods for handling memory- 
mapped I/O. The first method is more general; it satisfies 
both requirements listed above and places no restriction on 
the location of memory-mapped peripheral devices. The 
second method satisfies only the requirement for side ef- 
fects of writing, and it restricts the location of memory- 
mapped I/O devices, but it is more efficient for devices that 
do not have destructive-read ports. 

The first method for handling memory-mapped I/O uses two
signals: IOINH and IODEC. When the NS32GX32 generates
a read bus cycle, it asserts the output signal IOINH if either
of the I/O requirements listed above is not satisfied. That is,
IOINH is asserted during a read bus cycle when (1) the read
reference is for an instruction that may not be executed or
(2) the read reference occurs while a write reference is
pending for a previous instruction. When the read reference
is to a peripheral device that implements ports with
destructive-reading or side-effects of writing, the input signal
IODEC must be asserted; in addition, the device must not
be selected if IOINH is active. When the CPU detects that
the IODEC input signal is active while the IOINH output
signal is also active, it discards the data read during the bus
cycle and serializes instruction execution. See the next
section for details on serializing operations. The CPU then
generates the read bus cycle again, this time satisfying the
requirements for I/O and driving IOINH inactive.
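
The IOINH/IODEC handshake described above reduces to two small decisions, sketched here in Python. The function names and the returned strings are hypothetical; the sketch models only the conditions stated in the text and says nothing about signal timing or polarity.

```python
def ioinh_asserted(may_not_execute, write_pending):
    """IOINH is asserted on a read cycle when the read belongs to an
    instruction that may not be executed, or when a write for a
    previous instruction is still pending."""
    return may_not_execute or write_pending


def read_cycle_outcome(ioinh, iodec):
    """CPU reaction to a read bus cycle, per the rule in the text."""
    if ioinh and iodec:
        # Data is discarded, execution is serialized, and the read
        # is repeated with IOINH inactive.
        return "discard-serialize-retry"
    return "accept"
```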

The second method for handling memory-mapped I/O uses 
a dedicated region of memory. The NS32GX32 treats all 
references to the memory range from address FF000000 to 
address FFFFFFFF inclusive in a special manner. 

While a write to a location in this range is pending, reads 
from locations in the same range are delayed. However, 
reads from locations with addresses lower than FF000000 
may occur. Similarly, reads from locations in the above 
range may occur while writes to locations outside of the 
range are pending. 
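
The ordering rule for the dedicated region can be summarized as follows. This Python sketch covers only the region rule of the second method; it deliberately omits the separate address-dependency check that applies to all references, and its function names are illustrative assumptions.

```python
IO_REGION_FIRST = 0xFF000000  # first address of the dedicated region
IO_REGION_LAST = 0xFFFFFFFF   # last address of the dedicated region


def in_io_region(addr):
    return IO_REGION_FIRST <= addr <= IO_REGION_LAST


def read_delayed_by_region_rule(read_addr, pending_write_addrs):
    """A read inside the region waits while any write to the region
    is pending; every other combination may proceed."""
    return in_io_region(read_addr) and any(
        in_io_region(w) for w in pending_write_addrs
    )
```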

It is to be noted that the CPU may assert IOINH even when 
the reference is within the dedicated region. Refer to Sec- 
tion 3.5.8 for more information on the handling of I/O devic- 
es. 

3.1.3.3 Serializing Operations 

After executing certain instructions or processing an
exception, the CPU serializes instruction execution.
Serializing instruction execution means that the CPU
completes writing all previous instructions’ results to
memory, then begins fetching and executing the next
instruction.

For example, when a new value is loaded into the PSR by 
executing an LPRW instruction, the pipeline is flushed and a 
serializing operation takes place. This is necessary since 
the privilege level might have changed and the instructions 
following the LPRW instruction must be fetched again with 
the new privilege level. 

The CPU serializes instruction execution after executing one 
of the following instructions: BICPSRW, BISPSRW, BPT, 
CINV, DIA, FLAG (trap taken), LPR (CFG, INTBASE, PSR, 
UPSR, DCR, BPC, DSR, and CAR only), RETT, RETI, and 
SVC. Figure 3-4 shows the memory references after seriali- 
zation. 

Note 1: LPRB UPSR can be executed in User Mode to serialize instruction 
execution. 

Note 2: After an instruction that writes a result to memory is executed, the
updating of the result’s memory location may be delayed until the
next serializing operation.

Note 3: When reset or a nonrestartable bus error exception occurs, the CPU 
discards any results that have not yet been written to memory. 

[Figure: diagram of the INSTRUCTION FETCH, DATA READ and
DATA WRITE references of INSTRUCTION N and INSTRUCTION N+1.]

FIGURE 3-4. Memory References after Serialization

3.1.4 Slave Processor Instructions 

The NS32GX32 recognizes two groups of instructions as
being executable by external slave processors:

• Floating Point Instructions 

• Custom Slave Instructions 

Each Slave Instruction Set is enabled by a bit in the Configu- 
ration Register (Section 2.1.4). Any Slave Instruction which 
does not have its corresponding Configuration Register bit 
set will trap as undefined, without any Slave Processor com- 
munication attempted by the CPU. This allows software sim- 
ulation of a non-existent Slave Processor. 

3.1.4.1 Slave Instruction Protocol 

Slave Processor instructions have a three-byte Basic In- 
struction field, consisting of an ID Byte followed by an Oper- 
ation Word. The ID Byte has three functions: 

1) It identifies the instruction as being a Slave Processor 
instruction. 

2) It specifies which Slave Processor will execute it. 

3) It determines the format of the following Operation Word 
of the instruction. 

Upon receiving a Slave Processor instruction, the CPU
initiates the sequence outlined in Figure 3-5. While applying
Status code 11111 (Broadcast ID, Section 3.5.4.1), the CPU
transfers the ID Byte on bits D24-D31, the operation
word on bits D8-D23 in a swapped order of bytes and a
non-used byte XXXXXXXX (X = don't care) on bits D0-D7
(Figure 3-6).

[Figure: bus word layout; surviving field labels are
OPCODE (LOW) and OPCODE (HIGH).]

FIGURE 3-6. ID and Operation Word

FIGURE 3-7. Slave Processor Status Word
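
The bus layout used for the broadcast can be sketched as a 32-bit packing routine. The sketch below is an interpretation, not a specification: it assumes that "swapped order of bytes" means the low byte of the operation word appears on D16-D23 and the high byte on D8-D15, following the order of the OPCODE (LOW) and OPCODE (HIGH) labels of Figure 3-6. The function name and the sample values are hypothetical.

```python
def pack_slave_broadcast(id_byte, operation_word, dont_care=0x00):
    """Build the 32-bit value driven on D0-D31 during the broadcast.

    D24-D31: ID byte
    D16-D23: low byte of the operation word  (assumed position)
    D8-D15:  high byte of the operation word (assumed position)
    D0-D7:   unused byte (don't care)
    """
    low = operation_word & 0xFF
    high = (operation_word >> 8) & 0xFF
    return (
        ((id_byte & 0xFF) << 24)
        | (low << 16)
        | (high << 8)
        | (dont_care & 0xFF)
    )
```

With an ID byte of BE (hex) and an operation word of 1234 (hex), chosen purely as example values, the routine yields BE341200 (hex).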

All slave processors observe the bus cycle and inspect the 
identification code. The slave selected by the identification 
code continues with the protocol; other slaves wait for the 
next slave instruction to be broadcast. 

After transferring the slave instruction, the CPU sends to the
slave any source operands that are located in memory or
the General-Purpose registers. The CPU then waits for the
slave to assert SDN or FSSR. While the CPU is waiting, it
can perform bus cycles to fetch instructions and read
source operands for instructions that follow the slave
instruction being executed. If there are no bus cycles to
perform, the CPU is idle with a special Status indicating that
it is waiting for a slave processor. After the slave asserts
SDN or FSSR, the CPU follows one of the two sequences
described below.

If the slave asserts SDN, then the CPU checks whether the 
instruction stores any results to memory or the General-Pur- 
pose registers. The CPU reads any such results from the 
slave by means of 1 or 2 bus cycles and updates the desti- 
nation. 


If the slave asserts FSSR, then the NS32GX32 reads a 32- 
bit status word from the slave. The CPU checks bit 0 in the 
slave's status word to determine whether to update the PSR 
flags or to process an exception. Figure 3-7 shows the for- 
mat of the slave’s status word. 

If the Q bit in the status word is 0, the CPU updates the N, Z 
and L flags in the PSR. 

If the Q bit in the status word is set to 1, the CPU processes
either a Trap (UND) if TS is 1 or a Trap (SLAVE) if TS is 0.
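
The decision the CPU makes after reading the slave's status word can be sketched as follows. Only the position of the Q bit (bit 0) is stated in the text; the position of the TS bit is given by Figure 3-7 and is not reproduced here, so this hypothetical Python helper takes TS as an already-extracted argument.

```python
def slave_status_action(status_word, ts):
    """Return the CPU action for a status word read after FSSR.

    status_word: 32-bit status word; Q is taken from bit 0.
    ts: value of the TS bit (0 or 1), extracted by the caller
        according to Figure 3-7.
    """
    q = status_word & 1
    if q == 0:
        return "update N, Z and L flags in PSR"
    return "Trap (UND)" if ts else "Trap (SLAVE)"
```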

Note 1: Only the floating-point and custom compare instructions are allowed 
to return a value of 0 for the Q bit when the FSSR signal is activat- 
ed. All other instructions must always set the Q bit to 1 (to signal a 
Trap), when activating FSSR. 

Note 2: While executing a CINV instruction, the CPU displays the operation
code and source operand using slave processor write bus cycles, as
described in the protocol above. Nevertheless, the CPU does not
wait for SDN or FSSR to be asserted while executing this
instruction. This information can be used to monitor the contents
of the on-chip Instruction Cache and Data Cache.

Note 3: The slave processor must be ready to accept a new slave instruction
at any time, even while the slave is executing another instruction or
waiting for the CPU to read results.

3.1.4.2 Floating Point Instructions 

Table 3-1 gives the protocols followed for each Floating 
Point instruction. The instructions are referenced by their 
mnemonics. For the bit encodings of each instruction, see 
Appendix A. 

The Operand class columns give the Access Class for each 
general operand, defining how the addressing modes are 
interpreted (see Instruction Set Reference Manual). 

The Operand Issued columns show the sizes of the oper- 
ands issued to the Floating Point Unit by the CPU. “D” indi- 
cates a 32-bit Double Word, “i” indicates that the instruction 
specifies an integer size for the operand (B = Byte, W = 
Word, D = Double Word), “f” indicates that the instruction 
specifies a Floating Point size for the operand (F = 32-bit 
Standard Floating, L = 64-bit Long Floating). 

The Returned Value Type and Destination column gives the 
size of any returned value and where the CPU places it. The 
PSR-Bits-Affected column indicates which PSR bits, if any, 
are updated from the Slave Processor Status Word (Figure 
3-7). 

Any operand indicated as being of type “f” will not cause a 
transfer if the Register addressing mode is specified. This is 
because the Floating Point Registers are physically on the 
Floating Point Unit and are therefore available without CPU 
assistance. 

3.1.4.3 Custom Slave Instructions

Provided in the NS32GX32 is the capability of communicat- 
ing with a user-defined, “Custom” Slave Processor. The in- 
struction set provided for a Custom Slave Processor defines 
the instruction formats, the operand classes and the com- 
munication protocol. Left to the user are the interpretations 
of the Op Code fields, the programming model of the Cus- 
tom Slave and the actual types of data transferred. The pro- 
tocol specifies only the size of an operand, not its data type. 
Table 3-2 lists the relevant information for the Custom Slave
instruction set. The designation “c” is used to represent an
operand which can be a 32-bit (“D”) or 64-bit (“Q”) quantity
in any format; the size is determined by the suffix on the
mnemonic. Similarly, an “i” indicates an integer size (Byte,
Word, Double Word) selected by the corresponding
mnemonic suffix.

Any operand indicated as being of type “c” will not cause a 
transfer if the register addressing mode is specified. It is 
assumed in this case that the slave processor is already 
holding the operand internally. 

For the instruction encodings, see Appendix A. 

3.2 EXCEPTION PROCESSING 

Exceptions are special events that alter the sequence of 
instruction execution. The CPU recognizes three basic types 
of exceptions: interrupts, traps and bus errors. 

An interrupt occurs in response to an event signalled by
activating the NMI or INT input signals. Interrupts are
typically requested by peripheral devices that require the
CPU’s attention.

Traps occur as a result either of exceptional conditions 
(e.g., attempted division by zero) or of specific instructions 
whose purpose is to cause a trap to occur (e.g., supervisor 
call instruction). 

A bus error exception occurs when the BER signal is acti- 
vated during an instruction fetch or data transfer required by 
the CPU to execute an instruction. 

When an exception is recognized, the CPU saves the PC, 
PSR and optionally the MOD register contents on the inter- 
rupt stack and then it transfers control to an exception serv- 
ice procedure. 

Details on the operations performed in the various cases by 
the CPU to enter and exit the exception service procedure 
are given in the following sections. 

Note that the reset operation is not treated here as an
exception, even though, like any exception, it alters the
instruction execution sequence. The reason is that the CPU
handles reset in a significantly different way than it handles
exceptions.

Refer to Section 3.5.3 for details on the reset operation.

TABLE 3-1. Floating Point Instruction Protocols

           Operand 1  Operand 2  Operand 1  Operand 2  Returned Value   PSR Bits
Mnemonic   Class      Class      Issued     Issued     Type and Dest.   Affected

ADDf       read.f     rmw.f      f          f          f to Op.2        none
SUBf       read.f     rmw.f      f          f          f to Op.2        none
MULf       read.f     rmw.f      f          f          f to Op.2        none
DIVf       read.f     rmw.f      f          f          f to Op.2        none
MOVf       read.f     write.f    f          N/A        f to Op.2        none
ABSf       read.f     write.f    f          N/A        f to Op.2        none
NEGf       read.f     write.f    f          N/A        f to Op.2        none
CMPf       read.f     read.f     f          f          N/A              N,Z,L
FLOORfi    read.f     write.i    f          N/A        i to Op.2        none
TRUNCfi    read.f     write.i    f          N/A        i to Op.2        none
ROUNDfi    read.f     write.i    f          N/A        i to Op.2        none
MOVFL      read.F     write.L    F          N/A        L to Op.2        none
MOVLF      read.L     write.F    L          N/A        F to Op.2        none
MOVif      read.i     write.f    i          N/A        f to Op.2        none
LFSR       read.D     N/A        D          N/A        N/A              none
SFSR       N/A        write.D    N/A        N/A        D to Op.2        none
POLYf      read.f     read.f     f          f          f to F0          none
DOTf       read.f     read.f     f          f          f to F0          none
SCALBf     read.f     rmw.f      f          f          f to Op.2        none
LOGBf      read.f     write.f    f          N/A        f to Op.2        none



TABLE 3-2. Custom Slave Instruction Protocols

           Operand 1  Operand 2  Operand 1  Operand 2  Returned Value   PSR Bits
Mnemonic   Class      Class      Issued     Issued     Type and Dest.   Affected

CCAL0c     read.c     rmw.c      c          c          c to Op.2        none
CCAL1c     read.c     rmw.c      c          c          c to Op.2        none
CCAL2c     read.c     rmw.c      c          c          c to Op.2        none
CCAL3c     read.c     rmw.c      c          c          c to Op.2        none
CMOV0c     read.c     write.c    c          N/A        c to Op.2        none
CMOV1c     read.c     write.c    c          N/A        c to Op.2        none
CMOV2c     read.c     write.c    c          N/A        c to Op.2        none
CMOV3c     read.c     write.c    c          N/A        c to Op.2        none
CCMP0c     read.c     read.c     c          c          N/A              N,Z,L
CCMP1c     read.c     read.c     c          c          N/A              N,Z,L
CCV0ci     read.c     write.i    c          N/A        i to Op.2        none
CCV1ci     read.c     write.i    c          N/A        i to Op.2        none
CCV2ci     read.c     write.i    c          N/A        i to Op.2        none
CCV3ic     read.i     write.c    i          N/A        c to Op.2        none
CCV4DQ     read.D     write.Q    D          N/A        Q to Op.2        none
CCV5QD     read.Q     write.D    Q          N/A        D to Op.2        none
LCSR       read.D     N/A        D          N/A        N/A              none
SCSR       N/A        write.D    N/A        N/A        D to Op.2        none
LCR*       read.D     N/A        D          N/A        N/A              none
SCR*       write.D    N/A        N/A        N/A        D to Op.1        none

Note:
D = Double Word
i = Integer size (B, W, D) specified in mnemonic.
c = Custom size (D: 32 bits or Q: 64 bits) specified in mnemonic.
* = Privileged instruction: will trap if CPU is in User Mode.
N/A = Not Applicable to this instruction.

3.2.1 Exception Acknowledge Sequence 

When an exception is recognized, the CPU goes through 
three major steps: 

1) Adjustment of Registers. Depending on the source of the 
exception, the CPU may restore and/or adjust the con- 
tents of the Program Counter (PC), the Processor Status 
Register (PSR) and the currently-selected Stack Pointer 
(SP). A copy of the PSR is made, and the PSR is then set 
to reflect Supervisor Mode and selection of the Interrupt 
Stack. Trap (TRC) and Trap (OVF) are always disabled. 
Maskable interrupts are also disabled if the exception is 
caused by an interrupt, Trap (DBG), Trap (ABT) or bus 
error. 

2) Vector Acquisition. A vector is either obtained from the 
data bus or is supplied internally by default. 

3) Service Call. The CPU performs one of two sequences 
common to all exceptions to complete the acknowledge 
process and enter the appropriate service procedure. 
The selection between the two sequences depends on 
whether the Direct-Exception mode is disabled or en- 
abled. 

Direct-Exception Mode Disabled 

The Direct-Exception mode is disabled while the DE bit in
the CFG register is 0 (Section 2.1.4). In this case the CPU
first pushes the saved PSR copy along with the contents of
the MOD and PC registers on the interrupt stack. Then it
reads the double-word entry from the Interrupt Dispatch
Table at address ‘INTBASE + vector x 4’. See Figures 3-8
and 3-9. The CPU uses this entry to call the exception
service procedure, interpreting the entry as an external
procedure descriptor.

A new module number is loaded into the MOD register from 
the least-significant word of the descriptor, and the static- 
base pointer for the new module is read from memory and 
loaded into the SB register. Then the program-base pointer 
for the new module is read from memory and added to the 
most-significant word of the module descriptor, which is in- 
terpreted as an unsigned value. Finally, the result is loaded 
into the PC register. 
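
The address arithmetic described above can be sketched in Python. The helper names are hypothetical, and the truncation of results to 32 bits is an assumption; the memory reads that obtain the static-base and program-base pointers of the new module are outside the scope of the sketch.

```python
MASK32 = 0xFFFFFFFF


def dispatch_entry_address(intbase, vector):
    """Address of the double-word Dispatch Table entry:
    INTBASE + vector x 4."""
    return (intbase + vector * 4) & MASK32


def split_descriptor(entry):
    """Split a 32-bit descriptor into (module number, offset).

    The least-significant word is loaded into MOD; the
    most-significant word is an unsigned offset."""
    return entry & 0xFFFF, (entry >> 16) & 0xFFFF


def service_procedure_pc(program_base, offset):
    """PC of the service procedure: program base plus offset."""
    return (program_base + offset) & MASK32
```

For example, with INTBASE at 1000 (hex), vector 5 selects the entry at 1014 (hex).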

Direct-Exception Mode Enabled 

The Direct-Exception mode is enabled when the DE bit in
the CFG register is set to 1. In this case the CPU first
pushes the saved PSR copy along with the contents of the
PC register on the Interrupt Stack. The word stored on the
Interrupt Stack between the saved PSR and PC register is
reserved for future use; its contents are undefined. The CPU
then reads the double-word entry from the Interrupt
Dispatch Table at address ‘INTBASE + vector x 4’. The CPU
uses this entry to call the exception service procedure,
interpreting the entry as an absolute address that is simply
loaded into the PC register. Figure 3-10 provides a pictorial of
the acknowledge sequence. Note that while the
Direct-Exception mode is enabled, the CPU can respond
more quickly to interrupts and other exceptions because
fewer memory references are required to process an
exception. The MOD and SB registers, however, are not
initialized before the CPU transfers control to the service
procedure. Consequently, the service procedure is restricted
from executing any instructions, such as CXP, that use the
contents of the MOD or SB registers in effective address
calculations.

[Figure: Cascade Table (cascade address entries) located
below the address held in the Interrupt Base Register, and
Dispatch Table above it. The Dispatch Table holds the fixed
interrupts and traps (among them non-vectored interrupt,
non-maskable interrupt, slave processor trap, illegal
operation trap, supervisor call trap, divide by zero trap,
breakpoint trap, undefined instruction trap, restartable bus
error, non-restartable bus error, integer overflow trap and
reserved entries), followed by the vectored interrupts.]

FIGURE 3-8. Interrupt Dispatch Table

3.2.2 Returning from an Exception Service Procedure

To return control to an interrupted program, one of two
instructions can be used: RETT (Return from Trap) and RETI
(Return from Interrupt).

RETT is used to return from any trap, non-maskable
interrupt or bus error service procedure. Since some traps are
often used deliberately as a call mechanism for supervisor
mode procedures, RETT can also adjust the Stack Pointer
(SP) to discard a specified number of bytes from the original
stack as surplus parameter space.

RETI is used to return from a maskable interrupt service
procedure. Unlike RETT, RETI also informs any
external interrupt control units that interrupt service has
completed. Since interrupts are generally asynchronous
external events, RETI does not discard parameters from the
stack.

Both of the above instructions always restore the Program
Counter (PC) and the Processor Status Register from the
interrupt stack. If the Direct-Exception mode is disabled,
they also restore the MOD and SB register contents.
Figures 3-11 and 3-12 show the RETT and RETI instruction
flows when the Direct-Exception mode is disabled.

FIGURE 3-11. Return from Trap (RETT n) Instruction Flow.
Direct-Exception Mode Disabled.


3.2.3 Maskable Interrupts 

The INT pin is a level-sensitive input. A continuous low level
is allowed for generating multiple interrupt requests. The
input is maskable, and is therefore enabled to generate
interrupt requests only while the Processor Status Register I
bit is set. The I bit is automatically cleared during service of
an INT, NMI, Trap (DBG), or Bus Error request, and is
restored to its original setting upon return from the interrupt
service routine via the RETT or RETI instruction.

The INT pin may be configured via the SETCFG instruction
as either Non-Vectored (CFG Register bit I = 0) or
Vectored (bit I = 1).

3.2.3.1 Non-Vectored Mode

In the Non-Vectored mode, an interrupt request on the INT
pin will cause an Interrupt Acknowledge bus cycle, but the
CPU will ignore any value read from the bus and use instead
a default vector of zero. This mode is useful for small
systems in which hardware interrupt prioritization is
unnecessary.


3.2.3.2 Vectored Mode: Non-Cascaded Case

In the Vectored mode, the CPU uses an Interrupt Control
Unit (ICU) to prioritize many interrupt requests. Upon receipt
of an interrupt request on the INT pin, the CPU performs an
“Interrupt Acknowledge, Master” bus cycle (Section
3.5.4.6) reading a vector value from the low-order byte of
the Data Bus. This vector is then used as an index into the
Dispatch Table in order to find the External Procedure
Descriptor for the proper interrupt service procedure. The
service procedure eventually returns via the Return from
Interrupt (RETI) instruction, which performs an End of
Interrupt bus cycle, informing the ICU that it may re-prioritize
any interrupt requests still pending. The ICU provides the
vector number again, which the CPU uses to determine
whether it needs also to inform a Cascaded ICU (see below).

In a system with only one ICU (16 levels of interrupt), the
vectors provided must be in the range of 0 through 127; that
is, they must be positive numbers in eight bits. By providing
a negative vector number, an ICU flags the interrupt source
as being a Cascaded ICU (see below).

3.2.3.3 Vectored Mode: Cascaded Case 

In order to allow more levels of interrupt, provision is made 
in the CPU to transparently support cascading. Note that 
the Interrupt output from a Cascaded ICU goes to an Inter- 
rupt Request input of the Master ICU, which is the only ICU 
which drives the CPU INT pin. Refer to the ICU data sheet 
for details. 

In a system which uses cascading, two tasks must be per- 
formed upon initialization: 

1) For each Cascaded ICU in the system, the Master ICU 
must be informed of the line number on which it receives 
the cascaded requests. 

2) A Cascade Table must be established in memory. The 
Cascade Table is located in a NEGATIVE direction from 
the location indicated by the CPU Interrupt Base (INT- 
BASE) Register. Its entries are 32-bit addresses, pointing 
to the Vector Registers of each of up to 16 Cascaded 
ICUs. 

Figure 3-9 illustrates the position of the Cascade Table. To
find the Cascade Table entry for a Cascaded ICU, take its
Master ICU line number (0 to 15) and subtract 16 from it,
giving an index in the range -16 to -1. Multiply this value
by 4, and add the resulting negative number to the contents
of the INTBASE Register. The 32-bit entry at this address
must be set to the address of the Hardware Vector Register
of the Cascaded ICU. This is referred to as the “Cascade
Address.”
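
The Cascade Table address computation above can be written out directly. The following Python sketch uses a hypothetical function name and assumes that the result is truncated to 32 bits.

```python
def cascade_entry_address(intbase, master_line):
    """Address of the Cascade Table entry for the Cascaded ICU
    connected to the given Master ICU line (0 to 15)."""
    if not 0 <= master_line <= 15:
        raise ValueError("Master ICU line number must be 0 to 15")
    index = master_line - 16          # index in the range -16 to -1
    return (intbase + 4 * index) & 0xFFFFFFFF
```

With INTBASE at 1000 (hex), line 15 maps to the entry at 0FFC (hex) and line 0 to the entry at 0FC0 (hex), so the table occupies the 64 bytes immediately below INTBASE.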

Upon receipt of an interrupt request from a Cascaded ICU, 
the Master ICU interrupts the CPU and provides the nega- 
tive Cascade Table index instead of a (positive) vector num- 
ber. The CPU, seeing the negative value, uses it as an index 
into the Cascade Table and reads the Cascade Address 
from the referenced entry. Applying this address, the CPU 
performs an “Interrupt Acknowledge, Cascaded” bus cycle, 
reading the final vector value. This vector is interpreted by 
the CPU as an unsigned byte, and can therefore be in the 
range of 0 through 255. 
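
How the CPU distinguishes a direct vector from a Cascade Table index can be sketched as a signed interpretation of the byte returned by the Master ICU. The function name and return format are illustrative assumptions; the behavior for negative values outside the range -16 to -1 is not described in the text and is not modeled.

```python
def classify_master_vector(byte_value):
    """Interpret the byte read during the "Interrupt Acknowledge,
    Master" bus cycle as a signed 8-bit number.

    Returns ("direct", vector) for 0..127, or
    ("cascaded", index) for a negative Cascade Table index."""
    byte_value &= 0xFF
    signed = byte_value - 256 if byte_value & 0x80 else byte_value
    if signed < 0:
        return ("cascaded", signed)
    return ("direct", signed)
```

The final vector read from the Cascaded ICU is, by contrast, treated as unsigned, which is why it can range from 0 through 255.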

In returning from a Cascaded interrupt, the service proce- 
dure executes the Return from Interrupt (RETI) instruction, 
as it would for any Maskable Interrupt. The CPU performs 
an “End of Interrupt, Master” bus cycle, whereupon the 
Master ICU again provides the negative Cascade Table in- 
dex. The CPU, seeing a negative value, uses it to find the 
corresponding Cascade Address from the Cascade Table. 
Applying this address, it performs an “End of Interrupt, Cas- 
caded” bus cycle, informing the Cascaded ICU of the com- 
pletion of the service routine. The byte read from the Cas- 
caded ICU is discarded. 

Note: If an interrupt must be masked off, the CPU can do so by setting the
corresponding bit in the interrupt mask register of the interrupt
controller. However, if an interrupt is set pending during the CPU
instruction that masks off that interrupt, the CPU may still perform an
interrupt acknowledge cycle following that instruction since it might
have sampled the INT line before the ICU deasserted it. This could
cause the ICU to provide an invalid vector. To avoid this problem the
above operation should be performed with the CPU interrupt disabled.


3.2.4 Non-Maskable Interrupt 

The Non-Maskable Interrupt is triggered whenever a falling
edge is detected on the NMI pin. The CPU performs an
“Interrupt Acknowledge, Master” bus cycle (Section
3.5.4.6) when processing of this interrupt actually begins.
The Interrupt Acknowledge cycle differs from that provided
for Maskable Interrupts in that the address presented is
FFFFFF00 (hexadecimal). The vector value used for the
Non-Maskable Interrupt is taken as 1, regardless of the value
read from the bus.

The service procedure returns from the Non-Maskable In- 
terrupt using the Return from Trap (RETT) instruction. No 
special bus cycles occur on return. 


3.2.5 Traps

Traps are processing exceptions that are generated as direct
results of the execution of an instruction.

The return address saved on the stack by any trap except
Trap (TRC) and Trap (DBG) is the address of the first byte of
the instruction during which the trap occurred.

When a trap is recognized, maskable interrupts are not
disabled except for the case of Trap (DBG).

There are 10 trap conditions recognized by the NS32GX32
as described below.

Trap (SLAVE): An exceptional condition was detected by
the Floating Point Unit or another Slave Processor during
the execution of a Slave Instruction. This trap is requested
via the Status Word returned as part of the Slave Processor
Protocol (Section 3.1.4.1).

Trap (ILL): Illegal operation. A privileged operation was
attempted while the CPU was in User Mode (PSR bit U = 1).

Trap (SVC): The Supervisor Call (SVC) instruction was
executed.

Trap (DVZ): An attempt was made to divide an integer by
zero. (The FPU trap is used for Floating Point division by
zero.)

Trap (FLG): The FLAG instruction detected a “1” in the
PSR F bit.

Trap (BPT): The Breakpoint (BPT) instruction was executed.

Trap (TRC): The instruction just completed is being traced.
Refer to Section 3.3.1 for details.

Trap (UND): An Undefined-Instruction trap occurs when an
attempt to execute an instruction is made and one or more
of the following conditions is detected:

1. The instruction is undefined. Refer to Appendix A for a
description of the codes that the CPU recognizes to be
undefined.

2. The instruction is a floating point instruction and the F-bit
in the CFG register is 0.

3. The instruction is a custom slave instruction and the C-bit
in the CFG register is 0.

4. The reserved general addressing mode encoding (10011)
is used.

5. Immediate addressing mode is used for an operand that
has access class different from read.

6. Scaled Indexing is used and the basemode is also Scaled
Indexing.

7. The instruction is a floating-point or custom slave
instruction that the FPU or custom slave detects to be
undefined. Refer to Section 3.1.4.1 for more information.

Trap (OVF): An Integer-Overflow trap occurs when the V-bit 
in the PSR register is set to 1 and an Integer-Overflow con- 
dition is detected during the execution of an instruction. An 
Integer-Overflow condition is detected in the following cas- 
es: 

1. The F-flag is 1 following execution of an ADDi, ADDQi, 
ADDCi, SUBi, SUBCi, NEGi, ABSi, or CHECKi instruction. 

2. The product resulting from a MULi instruction cannot be 
represented exactly in the destination operand's location. 

3. The quotient resulting from a DEIi, DIVi, or QUOi instruc- 
tion cannot be represented exactly in the destination op- 
erand’s location. 

4. The result of an ASHi instruction cannot be represented 
exactly in the destination operand’s location. 

5. The sum of the ‘INC’ value and the ‘INDEX’ operand for 
an ACBi instruction cannot be represented exactly in the 
index operand’s location. 

Trap (DBG): A debug trap occurs when one or more of the
conditions selected by the settings of the bits in the DCR
register is detected. This trap can also be requested by
activating the input signal DBG. Refer to Section 3.3.2 for more
information.

Note 1: Following execution of the WAIT instruction, a Trap (DBG) can
be pending for a PC-match condition. In such an event, the Trap
(DBG) is processed immediately.

Note 2: If an attempt is made to execute a privileged custom instruction 
while in User-Mode and the C-bit in the CFG register is 0, then Trap 
(UND) occurs. 

Note 3: While operating in User-Mode, if an attempt is made to execute a
privileged instruction with an undefined use of a general addressing
mode (either the reserved encoding is used or else scaled-index or
immediate modes are incorrectly used), then Trap (UND) occurs.

Note 4: If an undefined instruction or illegal operation is detected, then no
data references are performed for the instruction.

Note 5: For certain instructions that are relatively long to execute, such as 
DEID, the CPU checks for pending interrupts during execution of the 
instruction. In order to reduce interrupt latency, the CPU can
suspend executing the instruction and process the interrupt. Refer 
to Section B.5 in Appendix B for more information about recognizing 
interrupts in this manner. 
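
Cases 2 through 5 of the Integer-Overflow conditions listed under Trap (OVF) above all ask whether a result fits in the destination operand. The following Python sketch illustrates that check. It assumes signed two's-complement operands; an instruction that treats its operands as unsigned would need the unsigned range instead. The function names are illustrative and do not come from the NS32GX32 documentation.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed two's-complement integer of this width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return lowest <= value <= highest


def mul_overflows(a: int, b: int, bits: int) -> bool:
    """Overflow condition for a signed multiply whose product must fit in `bits` bits."""
    return not fits_signed(a * b, bits)


def ovf_trap_taken(psr_v: bool, overflow: bool) -> bool:
    """Trap (OVF) needs both the V-bit in the PSR and a detected overflow condition."""
    return psr_v and overflow
```

For example, a byte-sized multiply of 100 by 2 overflows, while the same multiply on word-sized operands does not.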

3.2.6 Bus Errors 

A bus error exception occurs when the BER signal is assert- 
ed in response to an instruction fetch or data transfer that is 
required to execute an instruction. 

Two types of bus errors are recognized: Restartable and
Non-Restartable. Restartable bus errors are recognized during
read bus cycles. All other bus errors are non-restartable.
The CPU responds to restartable bus errors by suspending
the instruction that it was executing. When a non-restartable
bus error is detected, the CPU responds immediately and
the instruction being executed is terminated.
In this case, any results that have not yet been written to
memory are discarded, and any pending traps, other than
Trap (DBG) for external condition, are eliminated. The PC
value saved on the stack is undefined.

The NS32GX32 does not respond to bus errors indicated 
for instructions that are not executed. For example, no bus
error exception occurs in response to asserting the BER 
signal during a bus cycle to prefetch an instruction that is 
not executed because the previous instruction caused a 
trap. 

If a bus error is detected during a data transfer required for 
the processing of another exception or during the ICU read 
cycle of a RETI instruction, then the CPU considers it as a 
fatal bus error and enters the ‘HALTED’ state. 

Note 1: If the address and control signals associated with the last bus cycle
that caused a bus error are latched by external hardware, then the 
information they provide can be used by the service procedure for 
restartable bus errors to analyze and resolve the exception recog- 
nized by the CPU. This can be accomplished because upon detect- 
ing a restartable bus error, the NS32GX32 stops making memory 
references for subsequent instructions until it determines whether 
the instruction that caused the bus error is executed and the excep- 
tion is processed. 

Note 2: When a non-restartable bus error is recognized, the service proce- 
dure must execute the CINV instruction to invalidate the on-chip 
caches. This is necessary to maintain coherence between them and 
external memory. 

3.2.7 Priority Among Exceptions 

The CPU checks for specific exceptions at various points 
while executing an instruction. It is possible that several ex- 
ceptions occur simultaneously. In that event, the CPU re- 
sponds to the exception with highest priority. 

Figure 3-13 shows an exception processing flowchart. A 
non-restartable bus error is assigned highest priority and is 
serviced immediately regardless of the execution state of 
the CPU. 

Before executing an instruction, the CPU checks for pending
Trap (DBG), interrupts, and Trap (TRC), in that order. If a
Trap (DBG) is pending, then the CPU processes that exception;
otherwise the CPU checks for pending interrupts. At
this point, the CPU responds to any pending interrupt requests;
nonmaskable interrupts are recognized with higher
priority than maskable interrupts. If no interrupts are pending,
then the CPU checks the P-flag in the PSR to determine
whether a Trap (TRC) is pending. If the P-flag is 1, a Trap
(TRC) is processed. If no Trap (DBG), interrupt or Trap
(TRC) is pending, the CPU begins executing the instruction.

While executing an instruction, the CPU may recognize up
to three exceptions:

1. restartable bus error

2. trap (DBG) or interrupt, if the instruction is interruptible

3. one of 7 mutually exclusive traps: SLAVE, ILL, SVC, DVZ,
FLG, BPT, UND

If no exception is detected while the instruction is executing, 
then the instruction is completed and the PC is updated to 
point to the next instruction. If a Trap (OVF) is detected, 
then it is processed at this time. 
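
The order in which pending exceptions are checked before an instruction starts can be summarized in a few lines of Python. This is only a sketch of the ordering described above; the function and argument names are hypothetical, and the non-restartable bus error (which is serviced immediately in any execution state) is left out.

```python
from typing import Optional


def exception_before_instruction(dbg_pending: bool, nmi_pending: bool,
                                 int_pending: bool, psr_i: bool,
                                 psr_p: bool) -> Optional[str]:
    """Return the exception serviced before the next instruction, or None."""
    if dbg_pending:                 # Trap (DBG) is checked first
        return "DBG"
    if nmi_pending:                 # non-maskable beats maskable interrupts
        return "NMI"
    if int_pending and psr_i:       # maskable interrupts need PSR I = 1
        return "INT"
    if psr_p:                       # P-flag set: a Trap (TRC) is pending
        return "TRC"
    return None                     # nothing pending: execute the instruction
```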

While executing the instruction, the CPU checks for enabled 
debug conditions. If an enabled debug condition is met, a 
Trap (DBG) is held pending until after the instruction is com- 
pleted (see Note 3). If another exception is detected before 
the instruction is completed, the pending Trap (DBG) is re- 
moved and the DSR register is not updated. 

Note 1: Trap (DBG) can be detected simultaneously with Trap (OVF). In this 
event, the Trap (OVF) is processed before the Trap (DBG). 

Note 2: An address-compare debug condition can be detected while pro- 
cessing a bus error, interrupt, or trap. In this event, the Trap (DBG) 
is held pending until after the CPU has processed the first excep- 
tion. 

Note 3: Between operations of a string instruction, the CPU responds to 
pending operand address compare and external debug conditions 
as well as interrupts. If a PC-match debug condition is detected 
while executing a string instruction, then Trap (DBG) is held pending 
until the instruction has completed. 

3.2.8 Exception Acknowledge Sequences: Detailed Flow 

For purposes of the following detailed discussion of excep- 
tion acknowledge sequences, a single sequence called 
“service” is defined in Figure 3-14. 

Upon detecting any interrupt request, trap or bus error con- 
dition, the CPU first performs a sequence dependent upon 
the type of exception. This sequence will include saving a 
copy of the Processor Status Register and establishing a 
vector and a return address. The CPU then performs the 
service sequence. 

3.2.8.1 Maskable/Non-Maskable Interrupt Sequence

This sequence is performed by the CPU when the NMI pin
receives a falling edge, or the INT pin becomes active with
the PSR I bit set. The interrupt sequence begins either at 
the next instruction boundary or, in the case of an interrupt- 
ible instruction (e.g., string instruction), at the next interrupt- 
ible point during its execution. 

1. If an interruptible instruction was interrupted and not yet 
completed: 

a. Clear the Processor Status Register P bit. 

b. Set “Return Address” to the address of the first byte of 
the interrupted instruction. 

Otherwise, set “Return Address” to the address of the
next instruction.

2. Copy the Processor Status Register (PSR) into a tempo- 
rary register, then clear PSR bits T, V, U, S, P and I. 

3. If the interrupt is Non-Maskable: 

a. Read a byte from address FFFFFF00 (hex), applying
Status Code 00100 (Interrupt Acknowledge, Master). 
Discard the byte read. 

b. Set “Vector” to 1. 

c. Go to Step 8. 

4. If the interrupt is Non-Vectored: 

a. Read a byte from address FFFFFE00 (hex), applying
Status Code 00100 (Interrupt Acknowledge, Master). 
Discard the byte read. 

b. Set “Vector” to 0. 

c. Go to Step 8. 

5. Here the interrupt is Vectored. Read “Byte” from address
FFFFFE00 (hex), applying Status Code 00100 (Interrupt
Acknowledge, Master).

6. If “Byte” ≥ 0, then set “Vector” to “Byte” and go to Step 8.

7. If “Byte” is in the range -16 through -1, then the inter- 
rupt source is Cascaded. (More negative values are re- 
served for future use.) Perform the following: 

a. Read the 32-bit Cascade Address from memory. The
address is calculated as INTBASE + 4 * Byte.

b. Read “Vector,” applying the Cascade Address just 
read and Status Code 00101 (Interrupt Acknowledge, 
Cascaded). 

8. Perform Service (Vector, Return Address), Figure 3-14. 
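
Steps 5 through 7 of the vectored case can be sketched in Python as follows. The byte from the Master ICU is interpreted as a signed value; a negative value in the range -16 through -1 indexes the Cascade Table below INTBASE. The function name and the two callback parameters (`read_dword` for memory, `read_cascaded` for the Cascaded ICU) are assumptions made for this illustration.

```python
def resolve_vector(byte: int, intbase: int, read_dword, read_cascaded) -> int:
    """byte is the raw 8-bit value (0..255) read from the Master ICU."""
    signed = byte - 256 if byte >= 128 else byte   # interpret as a signed byte
    if signed >= 0:
        return signed                              # Step 6: the byte is the vector
    if -16 <= signed <= -1:
        # Step 7a: Cascade Table entry sits at INTBASE + 4 * Byte (Byte is negative)
        cascade_address = read_dword(intbase + 4 * signed)
        # Step 7b: the final vector is an unsigned byte, 0 through 255
        return read_cascaded(cascade_address) & 0xFF
    raise ValueError("more negative values are reserved")
```

With INTBASE = 0x1000, a Master ICU byte of 0xFD (that is, -3) selects the Cascade Table entry at 0x1000 - 12.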

3.2.8.2 Restartable Bus Error Sequence

1. Suspend instruction and restore the currently selected 
Stack Pointer to its original contents at the beginning of 
the instruction. 

2. Clear the PSR P bit. 

3. Copy the PSR into a temporary register, then clear PSR
bits T, V, U, S and I.

4. Set “Vector” to 11.

5. Set “Return Address” to the address of the first byte of 
the suspended instruction. 

6. Perform Service (Vector, Return Address), Figure 3-14.

3.2.8.3 SLAVE/ILL/SVC/DVZ/FLG/BPT/UND Trap 
Sequence 

1. Restore the currently selected Stack Pointer and the 
Processor Status Register to their original values at the 
start of the trapped instruction. 

2. Set “Vector” to the value corresponding to the trap type. 

SLAVE: Vector = 3. 

ILL: Vector = 4. 

SVC: Vector = 5. 

DVZ: Vector = 6. 

FLG: Vector = 7. 

BPT: Vector = 8. 

UND: Vector = 10.

3. If Trap (ILL) or Trap (UND) 

a. Clear the Processor Status Register P bit. 

4. Copy the Processor Status Register (PSR) into a tempo- 
rary register, then clear PSR bits T, V, U, S and P. 

5. Set “Return Address” to the address of the first byte of
the trapped instruction. 

6. Perform Service (Vector, Return Address), Figure 3-14. 

3.2.8.4 Trace Trap Sequence

1. In the Processor Status Register (PSR), clear the P bit. 

2. Copy the PSR into a temporary register, then clear PSR 
bits T, V, U and S. 

3. Set “Vector” to 9.

4. Set “Return Address” to the address of the next instruc- 
tion. 

5. Perform Service (Vector, Return Address), Figure 3-14. 

3.2.8.5 Integer-Overflow Trap Sequence 

1. Copy the PSR into a temporary register, then clear PSR 
bits T, V, U, S and P. 

2. Set “Vector” to 13. 

3. Set “Return Address” to the address of the next instruc- 
tion. 

4. Perform Service (Vector, Return Address), Figure 3-14. 

3.2.8.6 Debug Trap Sequence 

A debug condition can be recognized either at the next in- 
struction boundary or, in the case of an interruptible instruc- 
tion, at the next interruptible point during its execution. 

1. If PC-match condition, then go to Step 3.

2. If a String instruction was interrupted and not yet com- 
pleted: 

a. Clear the Processor Status Register P bit. 

b. Set “Return Address” to the address of the first byte of 
the instruction. 

c. Go to Step 4. 

3. Set “Return Address” to the address of the next instruc- 
tion. 

4. Set “Vector” to 14. 

5. Copy the Processor Status Register (PSR) into a tempo- 
rary register, then clear PSR bits T, V, U, S, P and I. 

6. Perform Service (Vector, Return Address), Figure 3-14. 

3.2.8.7 Non-Restartable Bus Error Sequence 

1. Set “Vector” to 12. 

2. Set “Return Address” to “Undefined”. 

3. Copy the Processor Status Register (PSR) into a tempo- 
rary register, then clear PSR bits T, V, U, S, P and I. 

4. Perform a dummy read of the Slave Status Word to reset 
the Slave Processor. 

5. Perform Service (Vector, Return Address), Figure 3-14. 


3.3 DEBUGGING SUPPORT 

The NS32GX32 provides several features to assist in
program debugging.

Besides the Breakpoint (BPT) instruction that can be used 
to generate soft breaks, the CPU also provides instruction 
tracing as well as debug trap (or hardware breakpoints) ca- 
pabilities. Details on these features are provided in the fol- 
lowing sub-sections. 

3.3.1 Instruction Tracing 

Instruction tracing is a very useful feature that can be used 
during debugging to single-step through selected portions of 
a program. Tracing is enabled by setting the T-bit in the PSR 
Register. When enabled, the CPU generates a Trace Trap 
(TRC) after the execution of each instruction. 

At the beginning of each instruction, the T bit is copied into 
the PSR P (Trace “Pending”) bit. If the P bit is set at the end 
of an instruction, then the Trace Trap is activated. If any 
other trap or interrupt request is made during a traced in- 
struction, its entire service procedure is allowed to complete 
before the Trace Trap occurs. Each interrupt and trap se- 
quence handles the P bit for proper tracing, guaranteeing 
only one Trace Trap per instruction, and guaranteeing that 
the Return Address pushed during a Trace Trap is always 
the address of the next instruction to be traced. 

Due to the fact that some instructions can clear the T and P 
bits in the PSR, in some cases a Trace Trap may not occur 
at the end of the instruction. This happens when one of the 
privileged instructions BICPSRW or LPRW PSR is executed. 
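
The interaction of the T and P bits can be illustrated with a small Python model. It is a simplified sketch: the `run_traced` function and its tuple format are hypothetical, and the model ignores the PSR save and restore performed by the Trace Trap sequence and the RETT instruction, keeping T as the traced program sees it.

```python
def run_traced(program, psr_t: bool):
    """program: list of (name, clears_t_and_p) tuples.

    Returns the names of the instructions that are followed by a Trace Trap.
    """
    trapped = []
    t = psr_t
    for name, clears_t_and_p in program:
        p = t                      # start of instruction: T is copied into P
        if clears_t_and_p:         # e.g. BICPSRW or LPRW PSR clearing both bits
            t = False
            p = False
        if p:                      # end of instruction: P set, Trace Trap occurs
            trapped.append(name)
    return trapped
```

Running `run_traced([("MOVD", False), ("BICPSRW", True), ("ADDD", False)], True)` reports a Trace Trap only after MOVD, since BICPSRW clears both bits before it completes.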


Exception                   Instruction         Cleared Before  Cleared After
                            Ending              Saving PSR      Saving PSR

Restartable Bus Error       Suspended           P               TVUSI
Nonrestartable Bus Error    Terminated          Undefined       TVUSPI
Interrupt                   Before Instruction  None/P*         TVUSPI
ILL, UND                    Suspended           P               TVUS
SLAVE, SVC, DVZ, FLG, BPT   Suspended           None            TVUSP
OVF                         Completed           None            TVUSP
TRC                         Before Instruction  P               TVUS
DBG                         Before Instruction  None/P*         TVUSPI

*Note: The P bit of the saved PSR is cleared in case the exception is acknowledged before the instruction is completed (e.g., interrupted string instruction). This is
to avoid a mid-instruction trace trap upon return from the Exception Service Routine.


Service (Vector, Return Address):

1) Push the PSR copy onto the Interrupt Stack as a 16-bit value.

2) If Direct-Exception mode is selected, then go to Step 4.

3) Push the MOD Register onto the Interrupt Stack as a 16-bit value.

4) Read the 32-bit Interrupt Dispatch Table (IDT) entry at address ‘INTBASE + Vector × 4’.

5) If Direct-Exception mode is selected, then go to Step 10.

6) Move the L.S. word of the IDT entry (Module Field) into the MOD Register.

7) Read the Program Base pointer from memory address ‘MOD + 8’, and add to it the M.S. word of the IDT entry (Offset Field), placing the result in the
Program Counter.

8) Read the new Static Base pointer from the memory address contained in MOD, placing it into the SB Register.

9) Go to Step 11.

10) Place the IDT entry in the Program Counter.

11) Push the Return Address onto the Interrupt Stack as a 32-bit quantity.

12) Serialize: Non-sequentially fetch the first instruction of the Exception Service Routine.

Note: Some of the Memory Accesses indicated in the service sequence may be performed in an order different from the one shown.

FIGURE 3-14. Service Sequence 
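
The exception summary table and the Direct-Exception path of the service sequence (Steps 1, 4, 10 and 11) can be modeled in Python as below. This is a sketch under stated assumptions: the PSR bit positions follow the usual Series 32000 layout and should be confirmed against the PSR description in Section 2; the names `acknowledge` and `service_direct`, the `read_dword` callback, and the list used as the Interrupt Stack are all hypothetical. The "None/P*" entries are treated as "None", that is, the exception is taken on an instruction boundary.

```python
# Assumed PSR bit positions (not stated in this section).
PSR_BIT = {"T": 1, "V": 4, "U": 8, "S": 9, "P": 10, "I": 11}

# exception -> (bits cleared before saving PSR, bits cleared after saving PSR)
CLEARED = {
    "RBE": ("P", "TVUSI"), "NBE": ("", "TVUSPI"),   # NBE "before" is undefined
    "INT": ("", "TVUSPI"), "NMI": ("", "TVUSPI"),
    "ILL": ("P", "TVUS"), "UND": ("P", "TVUS"),
    "SLAVE": ("", "TVUSP"), "SVC": ("", "TVUSP"), "DVZ": ("", "TVUSP"),
    "FLG": ("", "TVUSP"), "BPT": ("", "TVUSP"), "OVF": ("", "TVUSP"),
    "TRC": ("P", "TVUS"), "DBG": ("", "TVUSPI"),
}

# Vector numbers as given in the acknowledge sequences (INT here is non-vectored).
VECTOR = {"INT": 0, "NMI": 1, "SLAVE": 3, "ILL": 4, "SVC": 5, "DVZ": 6,
          "FLG": 7, "BPT": 8, "TRC": 9, "UND": 10, "RBE": 11, "NBE": 12,
          "OVF": 13, "DBG": 14}


def mask(bits: str) -> int:
    """Build a PSR mask from a string of bit names such as 'TVUS'."""
    m = 0
    for b in bits:
        m |= 1 << PSR_BIT[b]
    return m


def acknowledge(psr: int, exc: str):
    """Return (PSR copy that gets saved, PSR in effect for the service routine)."""
    before, after = CLEARED[exc]
    saved = psr & ~mask(before) & 0xFFFF
    return saved, saved & ~mask(after)


def service_direct(exc, psr, return_address, intbase, read_dword, stack):
    """Direct-Exception mode: the MOD and SB registers are not touched."""
    saved, new_psr = acknowledge(psr, exc)
    stack.append(("PSR", saved))                    # Step 1: 16-bit PSR copy
    pc = read_dword(intbase + VECTOR[exc] * 4)      # Steps 4 and 10
    stack.append(("RETURN", return_address))        # Step 11: 32-bit address
    return pc, new_psr
```

Note that a Trace Trap leaves the I bit set, whereas an interrupt clears it along with T, V, U, S and P.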

In other cases, it is still possible to guarantee that a Trace 
Trap occurs at the end of the instruction, provided that spe- 
cial care is taken before returning from the Trace Trap Serv- 
ice Procedure. In case a BICPSRB instruction has been ex- 
ecuted, the service procedure should make sure that the T 
bit in the PSR copy saved on the Interrupt Stack is set be- 
fore executing the RETT instruction to return to the program 
being traced. If the RETT or RETI instructions have to be
traced, the Trace Trap Service Procedure should set the P 
and T bits in the PSR copy on the Interrupt Stack that is 
going to be restored in the execution of such instructions. 

Note: If instruction tracing is enabled while the WAIT instruction is executed, 
the Trap (TRC) occurs after the next interrupt, when the interrupt 
service procedure has returned. 

3.3.2 Debug Trap Capability 

The CPU recognizes three different conditions to generate a
Debug Trap:

1) Address Compare

2) PC Match 

3) External 

These conditions can be enabled and monitored through 
the CPU Debug Registers. 

An address-compare condition is detected when certain 
memory locations are either read or written. The double- 
word address used for the comparison is specified in the 
CAR Register. The address-compare condition can be sep- 
arately enabled for each of the bytes in the specified dou- 
ble-word, under control of the CBE bits of the DCR Register. 
The VNP bit in the DCR controls whether virtual or physical 
addresses are compared. The CRD and CWR bits in the 
DCR separately enable the address compare condition for 
read and write references; the CAE bit in the DCR can be 
used to disable the compare-address condition indepen- 
dently from the other control bits. The CPU examines the 
address compare condition for all data reads and writes, 
reads of memory locations for effective address calculations,
Interrupt-Acknowledge and End-of-Interrupt bus cycles,
and memory references for exception processing.

The PC-match condition is detected when the address of 
the instruction equals the value specified in the BPC regis- 
ter. The PC-match condition is enabled by the PCE bit in the 
DCR. 

Detection of address-compare and PC-match conditions is 
enabled for User and Supervisor Modes by the UD and SD 
bits in the DCR. The DEN-bit can be used to disable detec- 
tion of these two conditions independently from the other 
control bits. 

An external condition is recognized whenever the DBG sig- 
nal is activated. 

When the CPU detects an address-compare or PC-match 
condition while executing an instruction or processing an 
exception, then Trap (DBG) occurs if the TR bit in the DCR 
is 1. When an external debug condition is detected, Trap 
(DBG) occurs regardless of the TR bit. The cause of the 
Trap (DBG) is indicated in the DSR Register. 
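
The address-compare check can be sketched in Python as follows. The sketch assumes that bit n of the CBE field enables byte n of the double-word addressed by CAR; that ordering, the function name and its parameters are assumptions made for illustration. The UD, SD, DEN and TR bits are deliberately left out.

```python
def address_compare_hit(addr: int, length: int, is_write: bool,
                        car: int, cbe: int, crd: bool, cwr: bool,
                        cae: bool = True) -> bool:
    """True if a reference of `length` bytes at `addr` meets the compare condition."""
    if not cae:                           # compare-address condition disabled
        return False
    if is_write and not cwr:              # writes not enabled
        return False
    if not is_write and not crd:          # reads not enabled
        return False
    base = car & ~0x3                     # double-word address held in CAR
    for a in range(addr, addr + length):
        same_dword = (a & ~0x3) == base
        byte_enabled = (cbe >> (a & 0x3)) & 1   # assumed: CBE bit n <-> byte n
        if same_dword and byte_enabled:
            return True
    return False
```

With CAR = 0x1000 and only byte 2 enabled, a one-byte read at 0x1002 matches, a one-byte read at 0x1001 does not, and a two-byte read at 0x1001 matches because it touches byte 2.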

When an address-compare or PC-match condition is detected
while executing an instruction, the CPU asserts the BP
signal at the beginning of the next instruction, synchronously
with PFS. If the instruction is not completed because a
higher priority trap is detected, the BP signal may or may not
be asserted.

Note 1: The assertion of BP is not affected by the setting of the TR bit in the 
DCR register. 

Note 2: While executing the MOVUS and MOVSU instructions, the com- 
pare-address condition is enabled for the User space memory refer- 
ence under control of the UD-bit in the DCR. 

Note 3: When the LPRi instruction is executed to load a new value into the 
BPC, CAR or DCR, it is undefined whether the address-compare 
and PC-match conditions, in effect while executing the instruction, 
are detected under control of the old or new contents of the loaded 
register. Therefore, any LPRi instruction that alters the control of the 
address-compare or PC-match conditions should use register or im- 
mediate addressing mode for the source operand. 

3.4 ON-CHIP CACHES 

The NS32GX32 provides two on-chip caches: the Instruction
Cache (IC) and the Data Cache (DC).

These are used to hold the contents of frequently used 
memory locations. 

The IC and DC can be individually enabled by setting
appropriate bits in the CFG Register (See Section 2.1.4).

The CPU also provides a locking feature that allows the
contents of the IC and DC to be locked to specific memory
locations. This is accomplished by setting the LIC and LDC
bits in the CFG register.

Cache locking can be successfully used in real-time applica- 
tions to guarantee fast access to critical instruction and data 
areas. 

Details on the organization and function of each of the 
caches are provided in the following sections. 

Note: The size and organization of the on-chip caches may change in future 
Series 32000 microprocessors. This however, will not affect software 
compatibility. 

3.4.1 Instruction Cache (IC)

The basic structure of the instruction cache (IC) is shown in
Figure 3-15.

The IC stores 512 bytes of code in a direct-mapped organization
with 32 sets. Direct-mapped means that each set
contains only one block, thus each memory location can be
loaded into the IC in only one place.

Each block contains a 23-bit tag, which holds the most-sig- 
nificant bits of the physical address for the locations stored 
in the block, along with 4 double-words and 4 validity bits 
(one for each double-word). 

A 4-double-word instruction buffer is also provided, which is 
loaded either from a selected cache block or from external 
memory. Instructions are read from this buffer by the loader 
unit and transferred to an 8-byte instruction queue. 

The IC may or may not be enabled to cache an instruction
being fetched by the CPU. It is enabled when the IC bit in
the CFG Register is set to 1.

If the IC is disabled, the CPU bypasses it during the instruction
fetch and its contents are not affected. The instruction
is read directly from external memory into the instruction
buffer.
FIGURE 3-15. Instruction Cache Structure

When the IC is enabled, the instruction address bits 4 to 8
are used to select the IC set where the instruction may be
stored. The tag corresponding to the single block in the set 
is compared with the 23 most-significant bits of the instruc- 
tion’s physical address. The 4 double-words in this block are 
loaded into the instruction buffer and the 4 validity bits are 
also retrieved. Bits 2 and 3 of the instruction’s physical ad- 
dress select one of these double-words and the associated 
validity bit. 

If the tag matches and the selected double-word is valid, a 
cache ‘hit’ occurs and the double-word is directly trans- 
ferred to the instruction queue for decoding; otherwise a 
cache ‘miss’ will result. 
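
The address decomposition used for the lookup can be written out in Python. With 16-byte blocks and 32 sets, bits 2 and 3 select the double-word, bits 4 to 8 select the set, and the remaining bits 9 to 31 form the 23-bit tag. The function names and the dictionary-based block layout are hypothetical and serve only to illustrate the hit test.

```python
def split_address(addr: int):
    """Split a 32-bit physical address into (tag, set index, double-word select)."""
    dword = (addr >> 2) & 0x3          # bits 2-3: one of 4 double-words
    set_index = (addr >> 4) & 0x1F     # bits 4-8: one of 32 sets
    tag = (addr >> 9) & 0x7FFFFF       # bits 9-31: 23-bit tag
    return tag, set_index, dword


def new_instruction_cache():
    """32 sets, one block per set, 4 double-words and 4 validity bits per block."""
    return [{"tag": None, "valid": [False] * 4, "data": [0] * 4}
            for _ in range(32)]


def ic_lookup(sets, addr: int):
    """Return the cached double-word on a hit, or None on a miss."""
    tag, set_index, dword = split_address(addr)
    block = sets[set_index]
    if block["tag"] == tag and block["valid"][dword]:
        return block["data"][dword]
    return None
```

For the address 0x12345678 this yields tag 0x091A2B, set 7 and double-word 2.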

In the latter case, if the cache is not locked, the CPU will 
take the following actions. 

First, if the tag of the selected block does not match, the tag 
is loaded with the 23 most-significant bits of the instruction 
address and all the validity bits are cleared. Then, the in- 
struction is read from external memory into the instruction 
buffer. 

If the CIIN input signal is not active during the fetching of the
missing instruction, then the IC is updated and the instruction
double-words fetched from memory are stored into it
with the validity bits set.

If the cache is locked, its contents are not affected, as the
CPU reads the missing instruction from external memory.

Whenever the CPU accesses external memory, whether or
not the IC is enabled, it always fetches instruction double-words
in a non-wrap-around fashion. Refer to Sections
3.5.4.3 and 3.5.6 for more information.

The contents of the instruction cache can be invalidated by
software through the CINV instruction. Refer to Section
3.4.3 for details. Clearing the IC bit in the CFG Register also
invalidates the instruction cache. Refer to Section C.2 for
information on loading the CFG register.


Note: If the IC is enabled for a certain instruction and a ‘miss’ occurs due to
a tag mismatch, the CPU will clear all the validity bits of the selected
tag before fetching the instruction from external memory. If the CIIN
input signal is activated during the fetching of that instruction, the
validity bits are not set and the IC is not updated.

3.4.2 Data Cache (DC) 

The Data Cache (DC) stores 1,024 bytes of data in a two-way
set associative organization as shown in Figure 3-16.
Each of the 32 sets has 2 cache blocks. Each block con- 
tains a 23-bit tag, which holds the most-significant bits of 
the address for the locations stored in the block, along with 
4 double-words and 4 validity bits (one for each double- 
word). 

The DC is enabled for a data read when all of the following 
conditions are satisfied. 

• The DC bit in the CFG Register is set to 1. 

• The reference is not an interlocked read resulting from 
executing a CBITI or SBITI instruction. 

If the DC is disabled, the CPU bypasses it during the data
read and its contents are not affected. The data is read
directly from external memory. The DC is also bypassed for
Interrupt-Acknowledge and End-of-Interrupt bus cycles.

When the DC is enabled for a data read, the address bits 4
to 8 are used to select the DC set where the data may be
stored.

The tags corresponding to the two blocks in the set are 
compared to the 23 most-significant bits of the address. Bits 
2 and 3 of the address select one double-word in each 
block and the associated validity bit. 

If one of the tags matches and the selected double-word in
the corresponding block is valid, a cache ‘hit’ occurs and 
the data is used to execute the instruction; otherwise a 
cache ‘miss’ will result. In the latter case, if the cache is not 
locked, the CPU will take the following actions. 
FIGURE 3-16. Data Cache Structure

First, if the tag of either block in the set matches the data
address, that block is selected for updating. Otherwise, if 
neither tag matches, then the least recently used block is 
selected; its tag is loaded with the 23 most-significant bits of 
the data address, and all the validity bits are cleared. 

Then, the data is read from external memory; up to 4
double-words are read into the cache in a wrap-around fashion.
Refer to Sections 3.5.4.3 and 3.5.6 for more information.

If the CIIN and IODEC input signals are both inactive during
the bus cycles performed to read the missing data, then the 
DC is updated, as each double-word is read from memory, 
and the corresponding validity bit is set. If the cache is 
locked, its contents are not affected, as the CPU reads the 
missing data from external memory. 

The DC is enabled for a data write whenever the DC bit in 
the CFG Register is set to 1, including interlocked writes 
resulting from executing the CBITI and SBITI instructions. 
The DC does not use write allocation. This means that, dur- 
ing a write, if a cache ‘hit’ occurs, the DC is updated, other- 
wise it is unaffected. The data is always written through to 
external memory. 
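
The read and write policies of the DC (two-way set associative, least-recently-used replacement on a read miss, write-through without write allocation) can be illustrated with a small Python model. It is a sketch under simplifying assumptions: only the requested double-word is filled on a miss rather than a burst of up to 4, all accesses are double-word sized, and cache locking, CIIN and IODEC are ignored. The class and method names are hypothetical.

```python
class DataCache:
    SETS, WAYS = 32, 2

    def __init__(self, memory):
        self.memory = memory                      # dict: aligned address -> value
        self.sets = [[{"tag": None, "valid": [False] * 4, "data": [0] * 4}
                      for _ in range(self.WAYS)] for _ in range(self.SETS)]
        self.lru = [0] * self.SETS                # least recently used way per set

    @staticmethod
    def _split(addr):
        return (addr >> 9) & 0x7FFFFF, (addr >> 4) & 0x1F, (addr >> 2) & 0x3

    def _find(self, tag, index):
        for way, block in enumerate(self.sets[index]):
            if block["tag"] == tag:
                return way
        return None

    def read(self, addr):
        """Return (value, hit)."""
        tag, index, dword = self._split(addr)
        way = self._find(tag, index)
        if way is not None and self.sets[index][way]["valid"][dword]:
            self.lru[index] = 1 - way             # the other way is now LRU
            return self.sets[index][way]["data"][dword], True
        if way is None:                           # neither tag matches: evict LRU
            way = self.lru[index]
            self.sets[index][way]["tag"] = tag
            self.sets[index][way]["valid"] = [False] * 4
        block = self.sets[index][way]
        value = self.memory.get(addr & ~0x3, 0)   # read from external memory
        block["data"][dword] = value
        block["valid"][dword] = True
        self.lru[index] = 1 - way
        return value, False

    def write(self, addr, value):
        tag, index, dword = self._split(addr)
        way = self._find(tag, index)
        if way is not None and self.sets[index][way]["valid"][dword]:
            self.sets[index][way]["data"][dword] = value   # hit: update cache
        self.memory[addr & ~0x3] = value          # always written through
```

A write to an address that is not cached updates only `memory`, so a subsequent read of that address still misses.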

The contents of the data cache can be invalidated by soft- 
ware through the CINV instruction. Clearing the DC bit in the 
CFG Register also invalidates the data cache. Refer to Sec- 
tion C.2 for information on loading the CFG register. 

Note: If the DC is enabled for a certain data reference and a ‘miss’ occurs
due to tag mismatch, the CPU will clear all the validity bits for the least
recently used tag before reading the data from external memory. If
either CIIN or IODEC is activated during the data read bus cycles,
the validity bits are not set and the DC is not updated.


3.4.3 Cache Coherence Support 

The NS32GX32 provides means for maintaining coherence 
between the on-chip caches and external memory. The 
CINV instruction can be executed to invalidate the Instruc- 
tion Cache and/or Data Cache; the CINV instruction can 
also be executed to invalidate a single 16-byte block in ei- 
ther or both caches. 

In hardware, the use of the caches can be inhibited for indi- 
vidual locations using the CIIN input signal. 

Whenever a CINV instruction is executed, the operation 
code and operand appear on the system interface using 
slave processor bus cycles. Thus, invalidations of the on- 
chip caches by software can be monitored externally. 

Note, however, that the software is responsible for commu- 
nicating to the external circuitry the values of the cache en- 
able and lock bits in the CFG Register, since the CPU does 
not generate any special cycle (e.g., Slave Cycle) when the 
CFG Register is loaded. 

3.5 SYSTEM INTERFACE 

This section provides general information on the NS32GX32 
interface to the external world. Descriptions of the CPU re- 
quirements as well as the various bus characteristics are 
provided here. Details on other device characteristics in- 
cluding timing are given in Chapter 4. 

3.5.1 Power and Grounding 

The NS32GX32 requires a single 5-volt power supply, ap- 
plied on 21 pins. The logic voltage pins (VCCL1 to VCCL6) 
supply the power to the on-chip logic. The buffer voltage 
pins (VCCB1 to VCCB14) supply the power to the output 
drivers of the chip. The bus clock power pin (VCCCLK) is 
the power supply for the on-chip clock drivers. All the volt- 
age pins should be connected together by a power (VCC) 
plane on the printed circuit board. 

The NS32GX32 grounding connections are made on 20 
pins. The logic ground pins (GNDL1 to GNDL6) are the 
ground pins for the on-chip logic. The buffer ground pins 
(GNDB1 to GNDB13) are the ground pins for the output 
drivers of the chip. The bus clock ground pin (GNDCLK) is 
the ground connection for the on-chip clock drivers. All the 
ground pins should be connected together by a ground 
plane on the printed circuit board. 

Both power and ground connections are shown in Figure 
3-17. 


FIGURE 3-17. Power and Ground Connections


3.5.2 Clocking 

The NS32GX32 requires a single-phase input clock signal
(CLK) with frequency twice the CPU’s operating frequency.
This clock signal is internally divided by two to generate two
non-overlapping phases, PHI1 and PHI2. A single-phase
clock signal, BCLK, in phase with PHI1, and its complement
(inverted BCLK) are also generated and output by the CPU for
timing reference.

Following power-on, the phase relationship between BCLK
and CLK is undefined. Nevertheless, in some systems it
may be necessary to synchronize the CPU bus timing to an
external reference. The SYNC input signal can be used to
initialize the phase relationship between CLK and BCLK.
SYNC can also be used to stretch BCLK (Low) while CLK is
toggling.

SYNC is sampled on each rising edge of CLK. As shown in
Figure 3-18, whenever SYNC is sampled low, BCLK stops
toggling and stays low. On the first rising edge that SYNC is
sampled high, BCLK is driven high and then toggles on each
subsequent rising edge of CLK.

Every rising edge of BCLK defines a transition in the timing 
state (“T-State”) of the CPU. 

One T-State represents the execution of one microinstruc- 
tion within the CPU and/or one step of an external bus 
transfer. 

Note: The CPU requirement on the maximum period of BCLK must be satis- 
fied when SYNC is asserted at times other than reset. 
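
The SYNC behavior described in this section can be modeled at the level of CLK rising edges. The Python sketch below is an idealized illustration only, not a timing specification: the function name, the boolean encoding of SYNC, the initial BCLK level, and the assumption that BCLK changes on the same CLK edge at which SYNC is sampled are all hypothetical choices made for the example.

```python
def bclk_levels(sync_samples, initial_bclk=0):
    """Return the modeled BCLK level after each rising edge of CLK.

    sync_samples: iterable of booleans, True when SYNC is sampled high.
    initial_bclk: assumed BCLK level before the first edge (illustrative).
    """
    bclk = initial_bclk
    held_low = False  # True while SYNC has been sampled low
    levels = []
    for sync_high in sync_samples:
        if not sync_high:
            # SYNC sampled low: BCLK stops toggling and stays low.
            bclk = 0
            held_low = True
        elif held_low:
            # First edge with SYNC high again: BCLK is driven high.
            bclk = 1
            held_low = False
        else:
            # Normal operation: BCLK toggles on every rising edge of CLK.
            bclk ^= 1
        levels.append(bclk)
    return levels


if __name__ == "__main__":
    print(bclk_levels([True, True, False, False, True, True, True]))
```

In this model BCLK is simply CLK divided by two, except that a low SYNC sample forces it low and the next high sample restarts it from a high level.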

3.5.3 Resetting 

The RST input pin is used to reset the NS32GX32. The CPU 
samples RST synchronously on the rising edge of BCLK. 
Whenever a low level is detected, the CPU responds imme- 
diately. Any instruction being executed is terminated; any 
results that have not yet been written to memory are dis-
carded; and any pending bus errors, interrupts, and traps
are eliminated. The internal latches for the edge-sensitive
NMI and DBG signals are cleared.


FIGURE 3-18. Bus Clock Synchronization

The CPU stores the PC contents in the R0 Register and the
PSR contents in the least-significant word of R1, leaving the
most-significant word undefined. The PC is then cleared to 0
and so are all the implemented bits in the PSR, MSR, MCR
and CFG registers. The DEN-bit in the DCR Register is also
cleared to 0. After reset, the remaining implemented bits in
DCR and the contents of all other registers are undefined.
The CPU begins executing the instruction at Address 0.

On application of power, RST must be held low for at least
50 μs after VCC is stable. This is to ensure that all on-chip
voltages are completely stable before operation. Whenever 
a Reset is applied, it must also remain active for not less 
than 64 BCLK cycles. See Figures 3-19 and 3-20. 

While in the Reset state, the CPU drives the signals ADS,
BE0-3, BMT, CONF and HLDA inactive. The data bus is
floated and the state of all other output signals is undefined.

Note 1: If HOLD is active at the time RST is deasserted, the CPU acknowl-
edges HOLD before performing any bus cycle.

Note 2: If SYNC is asserted while the CPU is being reset, then BCLK does
not toggle. Consequently, SYNC must be high for at least 128 CLK
cycles while RST is low.
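
The reset duration requirements above reduce to simple arithmetic. The following Python sketch shows one way to compute the minimum time RST must be held low; the function names and the 25 MHz BCLK frequency used in the example are assumptions made for illustration, and the authoritative values remain those of the timing specifications in Section 4.

```python
POWER_ON_HOLD_S = 50e-6        # RST low for at least 50 us after VCC is stable
MIN_RESET_BCLK_CYCLES = 64     # any reset must last at least 64 BCLK cycles


def min_reset_low_time(bclk_hz, power_on):
    """Minimum time in seconds that RST must stay low.

    bclk_hz:  BCLK frequency in Hz (half the CLK input frequency).
    power_on: True for a power-on reset, False for a reset applied later.
    """
    cycles_time = MIN_RESET_BCLK_CYCLES / bclk_hz
    if power_on:
        # Both conditions must hold, so the longer one governs.
        return max(POWER_ON_HOLD_S, cycles_time)
    return cycles_time


def min_sync_high_clk_cycles():
    """CLK runs at twice the BCLK rate, so 64 BCLK cycles are 128 CLK cycles."""
    return 2 * MIN_RESET_BCLK_CYCLES


if __name__ == "__main__":
    bclk_hz = 25e6  # hypothetical operating frequency
    print(min_reset_low_time(bclk_hz, power_on=False))
    print(min_reset_low_time(bclk_hz, power_on=True))
    print(min_sync_high_clk_cycles())
```

The last function makes the link between the two figures explicit: the 128 CLK cycles required of SYNC in Note 2 correspond to the 64 BCLK cycles required of RST.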



FIGURE 3-19. Power-On Reset Requirements

FIGURE 3-20. General Reset Timing

3.5.4 Bus Cycles 

The NS32GX32 CPU will perform bus cycles for one of the 
following reasons: 

1. To fetch instructions from memory. 

2. To write or read data to or from memory or peripheral 
devices. Peripheral input and output are memory mapped 
in the Series 32000 family. 

3. To acknowledge an interrupt and allow external circuitry 
to provide a vector number, or to acknowledge comple- 
tion of an interrupt service routine. 

4. To transfer information to or from a Slave Processor. 

In terms of bus timing, cases 1 through 3 above are identi-
cal. For timing specifications, see Section 4. The only exter-
nal difference between them is the 5-bit code placed on the
Bus Status pins (ST0-ST4). Slave Processor cycles differ in
that separate control signals are applied (Section 3.5.4.7).




3.5.4.1 Bus Status

The CPU presents five bits of Bus Status information on
pins ST0-ST4. The various combinations on these pins in-
dicate why the CPU is performing a bus cycle or, if it is idle
on the bus, why it is idle.

The Bus Status pins are interpreted as a five-bit value, with 
ST0 the least significant bit. Their values decode as follows: 

00000 The bus is idle because the CPU does not yet need 
to access the bus. 

00001 The bus is idle because the CPU is waiting for an 
interrupt following execution of the WAIT instruc- 
tion. 

00010 The bus is idle because the CPU has halted after 
detecting a bus error while processing an excep- 
tion. 

00011 The bus is idle because the CPU is waiting for a 
Slave Processor to complete executing an instruc- 
tion. 

00100 Interrupt Acknowledge, Master. 

The CPU is reading an interrupt vector to acknowl- 
edge an interrupt request. 

00101 Interrupt Acknowledge, Cascaded. 

The CPU is reading an interrupt vector to acknowl- 
edge a maskable interrupt request from a Cascad- 
ed Interrupt Control Unit. 

00110 End of Interrupt, Master. 

The CPU is performing a read cycle to indicate that 
it is executing a Return from Interrupt (RETI) in- 
struction at the completion of an interrupt’s service 
procedure. 

00111 End of Interrupt, Cascaded. 

The CPU is performing a read cycle from a Cascad- 
ed Interrupt Control Unit to indicate that it is execut- 
ing a Return from Interrupt (RETI) instruction at the 
completion of an interrupt’s service procedure. 

01000 Sequential Instruction Fetch. 

The CPU is fetching the next double-word in se- 
quence from the instruction stream. 

01001 Non-Sequential Instruction Fetch. 

The CPU is fetching the first double-word of a new 
sequence of instruction. This will occur as a result 
of any JUMP or BRANCH, any exception, or after 
the execution of certain instructions. 

01010 Data Transfer. 

The CPU is reading or writing an operand for an 
instruction, or it is referring to memory while pro- 
cessing an exception. 

01011 Read RMW Class Operand. 

The CPU is reading an operand with access class 
of read-modify-write. 

01100 Read for Effective Address Calculation.

The CPU is reading a pointer from memory in order 
to calculate an effective address for Memory Rela- 
tive or External addressing modes. 

11101 Transfer Slave Processor Operand.

The CPU is transferring an operand to or from a 
Slave Processor. 

11110 Read Slave Processor Status. 

The CPU is reading a status word from a slave
processor after the slave processor has activated
the FSSR signal.

11111 Broadcast Slave Processor ID + OPCODE. 

The CPU is initiating the execution of a Slave In- 
struction by transferring the first 3 bytes of the in- 
struction, which specify the Slave Processor identi- 
fication and operation. 
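
The status codes listed above lend themselves to a lookup table, for example in a logic-analyzer script or a bus monitor model. The Python sketch below is illustrative: the dictionary and function names are hypothetical, and only the codes listed in this section are included, so any other value is reported as not listed.

```python
# Bus status codes as listed in this section (ST4 is the most significant bit).
BUS_STATUS = {
    0b00000: "Idle: CPU does not yet need to access the bus",
    0b00001: "Idle: waiting for an interrupt after WAIT",
    0b00010: "Idle: halted after a bus error while processing an exception",
    0b00011: "Idle: waiting for a Slave Processor",
    0b00100: "Interrupt Acknowledge, Master",
    0b00101: "Interrupt Acknowledge, Cascaded",
    0b00110: "End of Interrupt, Master",
    0b00111: "End of Interrupt, Cascaded",
    0b01000: "Sequential Instruction Fetch",
    0b01001: "Non-Sequential Instruction Fetch",
    0b01010: "Data Transfer",
    0b01011: "Read RMW Class Operand",
    0b01100: "Read for Effective Address Calculation",
    0b11101: "Transfer Slave Processor Operand",
    0b11110: "Read Slave Processor Status",
    0b11111: "Broadcast Slave Processor ID + OPCODE",
}


def decode_bus_status(st0, st1, st2, st3, st4):
    """Combine the five pin levels (0 or 1) with ST0 as the least significant bit."""
    code = st0 | (st1 << 1) | (st2 << 2) | (st3 << 3) | (st4 << 4)
    return BUS_STATUS.get(code, "Not listed in this section")


if __name__ == "__main__":
    # ST4..ST0 = 01010
    print(decode_bus_status(st0=0, st1=1, st2=0, st3=1, st4=0))
```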

3.5.4.2 Basic Read and Write Cycles 

The sequence of events occurring during a basic CPU ac- 
cess to either memory or peripheral device is shown in Fig- 
ure 3-21 for a read cycle, and Figure 3-22 for a write cycle. 
The cases shown assume that the selected memory or pe- 
ripheral device is capable of communicating with the CPU at 
full speed. If not, then cycle extension may be requested
through the RDY line. See Section 3.5.4.4.

A full speed bus cycle is performed in two cycles of the
BCLK clock, labeled T1 and T2. For both read and write bus
cycles the CPU asserts ADS during the first half of T1 indi-
cating the beginning of the bus cycle. From the beginning of
T1 until the completion of the bus cycle the CPU drives the
Address Bus and other relevant control signals as indicated
in the timing diagrams. For cacheable data read cycles the
CPU also drives the CASEC signal to indicate the block in
the DC set where the data will be stored. If the bus cycle is
not cancelled (i.e., state T2 is entered in the next clock
cycle), the confirm signal (CONF) is asserted in the middle
of T1. Note that due to a bus cycle cancellation, the BMT
signal may be asserted at the beginning of T1, and then
deasserted before the time in which it is guaranteed valid
(see Section 4.4.2).

A confirmed bus cycle is completed at the end of T2, unless 
a cycle extension is requested. Following state T2 is either 
state T1 of the next bus cycle, or an idle T-state, if the CPU 
has no bus cycle to perform. 

In case of a read cycle the CPU samples the data bus at the 
end of state T2. 

If a bus exception is detected, the data is ignored. 

For write bus cycles, valid data is output from the middle of 
T1 until the end of the cycle. When a write bus cycle is 
immediately followed by another write cycle, the CPU keeps 
driving the bus with the data related to the previous cycle 
until the middle of state T1 of the second bus cycle.

The CPU always inserts an idle state before a write cycle
when the write immediately follows a confirmed read cycle.

Note: The CPU can initiate a bus cycle with a T1-state and then cancel the
cycle, such as when a Cache hit occurs. In such a case, the CONF
signal remains High and the BMT signal is driven High; the T1-state is
followed by another T1-state or an idle T-state.


[Timing diagram: BCLK, A0-31, D0-31, DDIN, BMT, CONF, RDY, BRT and ST0-4, U/S across the T-states T1 and T2]

FIGURE 3-22. Write Cycle

3.5.4.3 Burst Cycles

The NS32GX32 is capable of performing burst cycles in or-
der to increase the bus transfer rate. Burst is only available
in instruction fetch cycles and data read cycles from 32-bit
wide memories. Burst is not supported in operand write cy-
cles or slave cycles.


The sequence of events for burst cycles is shown in Figure 
3-23. The case shown assumes that the selected memory is 
capable of communicating with the CPU at full speed. If not,
then cycle extension can be requested through the RDY 
line. See Section 3.5.4.4. 

A Burst cycle is composed of two parts. The first part is a
regular cycle (opening cycle), in which the CPU outputs the
new status and asserts all the other relevant control signals.
In addition, the Burst Out Signal (BOUT) is activated by the
CPU indicating that the CPU can perform Burst cycles. If the
selected memory allows Burst cycles, it will notify the CPU
by activating the burst in signal (BIN). BIN is sampled by the
CPU in the middle of T2 on the falling edge of BCLK. If the
memory does not allow burst (BIN high), the cycle will termi-
nate at the end of T2 and BOUT will go inactive immediate-
ly. If the memory allows burst (BIN low), and the CPU has
not deasserted BOUT, the second part of the Burst cycle
will be performed and BOUT will remain active until termina-
tion of the Burst.

The second part consists of up to 3 nibbles, labeled T2B. In 
each of them a data item is read by the CPU. For each 
nibble in the burst sequence the CPU forces the 2 least-sig- 
nificant bits of the address to 0 and increments address bits 
2 and 3 to select the next double-word; all the byte enable
signals (BE0-3) are activated.

As shown in Figures 3-23 and 4-8 (in Section 4), the CPU
samples RDY at the end of each nibble. It extends the ac- 
cess time for the burst transfer if RDY is inactive. 

The CPU initiates burst read cycles in the following cases. 

1. An instruction must be fetched (Status = 01000 or
01001), and the instruction address does not fall within
the last double-word in an aligned 16-byte block (i.e.,
address bits 2 and 3 are not both equal to 1).

2. A data item must be read (Status = 01010, 01011 or
01100), and both of the following conditions are met.

• The data cache is enabled and not locked. (DC = 1 
and LDC = 0 in the CFG register.) 

• The bus cycle is not an interlocked data access per- 
formed while executing a CBITI or SBITI instruction. 

The Burst sequence will be terminated when one of the 
following events occurs. 

1. The last instruction double-word in an aligned 16-byte 
block has been fetched. 

2. The CPU detects that the instructions being prefetched 
are no longer needed due to an alteration of the flow of 
control. This happens, for example, when a Branch in- 
struction is executed or an exception occurs. 

3. 4 double-words of data have been read by the CPU. The
double-words are transferred within an aligned 16-byte
block in a wrap-around order. For example, if a source
operand is located at address 104 (hexadecimal), then the
burst read cycle transfers the double-words at 104, 108,
10C, and 100, in that order.
FIGURE 3-23. Burst Read Cycles

4. The BIN signal is deasserted. 

5. BRT is asserted to signal a bus retry. 

6. IODEC is asserted or the BW0-1 signals indicate a bus 
width other than 32-bits. The CPU samples these signals 
during state T2 of the opening cycle. During T2B-states
BW0-1 are ignored and IODEC must be kept HIGH. 

The CPU uses only the values of the above signals sampled 
during the last state of the transfer when the cycle is ex- 
tended. See Section 3.5.4.4. 

Note: A burst sequence is not stopped by the assertion of either BER or 
CIIN. See Note 3 in Section 3.5.5. 
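
The wrap-around order used for data bursts can be expressed compactly: address bits 0 and 1 are forced to 0 and bits 2 and 3 are incremented modulo 4 within the aligned 16-byte block. The Python sketch below illustrates this for a data read in which all four double-words are transferred; the function name is hypothetical, the start address 0x104 is taken as hexadecimal so that 0x100 is the base of an aligned block, and early termination (BIN deasserted, BRT, and so on) is not modeled.

```python
def data_burst_order(address):
    """Double-word addresses of a complete 4-double-word data burst.

    The first entry is the double-word containing `address`; the remaining
    three follow in wrap-around order inside the aligned 16-byte block.
    """
    block_base = address & ~0xF          # start of the aligned 16-byte block
    first_index = (address >> 2) & 0x3   # address bits 2 and 3
    return [block_base | (((first_index + k) & 0x3) << 2) for k in range(4)]


if __name__ == "__main__":
    print([hex(a) for a in data_burst_order(0x104)])
```

Instruction fetch bursts differ in that they stop at the last double-word of the block instead of wrapping around.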

3.5.4.4 Cycle Extension 

To allow sufficient access time for any speed of memory or 
peripheral device, the NS32GX32 provides for extension of 
a bus cycle. Any type of bus cycle except a slave processor 
cycle can be extended. 

A bus cycle can be extended by causing state T2 for a 
normal cycle or state T2B for a Burst cycle to be repeated. 
At the end of each T2 or T2B state, on the rising edge of
BCLK, the RDY line is sampled by the CPU. If RDY is active,
then the transfer cycle will be completed. If RDY is inactive, 
then the bus cycle is extended by repeating the T-state for 
another clock cycle. These additional T-states inserted by 
the CPU in this manner are called ‘WAIT’ states. 

During a transfer the CPU samples the input control signals 
BIN, BER, BRT, BW0-1, CIIN and IODEC. 

When wait states are inserted, only the values of these sig-
nals sampled during the last wait state are significant.

Figure 3-24 illustrates a normal read cycle with wait states
added through the RDY pin.

Note: If RST is asserted during a bus cycle, then the cycle is terminated
regardless of RDY.
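
The effect of RDY on the length of a basic (non-burst) bus cycle can be sketched as follows. This Python model is a simplification for illustration: the function name and the boolean encoding of the RDY samples are assumptions, and cancellation, bus retry and reset are not modeled.

```python
def bus_cycle_t_states(rdy_samples):
    """Count the T-states of a basic bus cycle.

    rdy_samples: booleans, one per T2 state, True when RDY is sampled
                 active at the end of that state.
    Returns T1 plus every T2 state up to the one in which RDY is active.
    """
    t_states = 1  # state T1
    for rdy_active in rdy_samples:
        t_states += 1  # state T2, or a repeated T2 (WAIT state)
        if rdy_active:
            return t_states
    raise ValueError("RDY was never sampled active; the cycle does not complete")


if __name__ == "__main__":
    print(bus_cycle_t_states([True]))                # full-speed cycle
    print(bus_cycle_t_states([False, False, True]))  # two WAIT states
```

A full-speed cycle therefore takes two T-states, and each inactive RDY sample adds one WAIT state.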

3.5.4.5 Interlocked Bus Cycles 

The NS32GX32 supports indivisible read-modify-write trans- 
actions by asserting the ILO signal during consecutive read 
and write operations. See Figure 4-7 in Section 4. 
Interlocked transactions are always preceded and followed 
by one or more idle T-states. 

The ILO signal is asserted in the middle of the idle T-state 
preceding state T1 of the read operation, and is deasserted
in the middle of one of the idle T-states following completion 
of the write operation, including any retried bus cycles. 

No other bus operations (e.g., instruction fetches) will occur 
while an interlocked transaction is taking place. 

Interlocked transactions are required in multiprocessor sys- 
tems to handle shared resources. The CPU uses them to 
reference data while executing a CBITIi or SBITIi instruction, 
during which a single byte of data is read and written. 

The ILO signal is always released for one or more clock 
cycles in the middle of two consecutive interlocked transac- 
tions. 

Note 1: If a bus error is detected during an interlocked read cycle, the sub-
sequent interlocked write cycle will not be performed, and ILO is
deasserted before the next bus cycle begins.


3.5.4.6 Interrupt Control Cycles 

The CPU generates Interrupt-Acknowledge bus cycles in re-
sponse to non-maskable interrupt and enabled maskable
interrupt requests.

The CPU also generates one or two End-of-Interrupt bus
cycles during execution of the Return-from-Interrupt (RETI)
instruction.

The timing for the interrupt control cycles is the same as for
the basic memory read cycle shown in Figure 3-21; only the
status presented on pins ST0-4 is different. These cycles
are single-byte read cycles, and they always bypass the
data cache.

Table 3-4 shows the interrupt control sequences associated 
with each interrupt and with the return from its service pro- 
cedure. 

3.5.4.7 Slave Processor Bus Cycles 

The NS32GX32 performs bus cycles to transfer information 
to or from slave processors while executing floating-point or 
custom-slave instructions. 

The CPU uses slave write bus cycles to broadcast the iden- 
tification and operation codes of a slave instruction as well 
as to transfer operands from memory or general purpose 
registers to a slave. 

Figure 3-25 shows the timing for a slave write bus cycle.
The CPU asserts SPC during T1; the status is valid during 
T1 and T2. The operation code or operand is output on the 
data bus from the middle of T1 until the end of T2. 

The CPU uses a slave read bus cycle to transfer a result 
operand from a slave to either memory or a general purpose 
register. A slave read cycle is also used to read a status
word when the FSSR signal is asserted. Figure 3-26 shows 
the timing for a slave read bus cycle. 

During T1 and T2 the CPU drives the status lines and as-
serts SPC. The data from the slave is sampled at the end of
T2. 

The CPU will never perform another slave cycle immediately 
following a slave read cycle. In fact, the T-state following 
state T2 of a slave read cycle is either an idle T-state or the 
T1 state of a memory cycle. 

Slave processor data transfers are always 32 bits wide. If 
the operand is a single byte, then it is transferred on D0
through D7. If it is a word, then it is transferred on D0
through D15. 

When two operands are transferred, operand 1 is trans- 
ferred before operand 2. For double-precision operands, the 
least-significant double-word is transferred before the most- 
significant double-word. 

During a slave bus cycle the output signals BE0-3 are un-
defined while the input signals BW0-1 and RDY are ig-
nored.

BER and BRT must be kept high. 
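
The operand ordering rules for slave transfers can be illustrated with a short sketch. In the Python code below, the function name is hypothetical, and placing zeros on the unused upper data lines for byte and word operands is an assumption of the model, since the text only states which lines carry the operand.

```python
def slave_transfers(operand, size_bytes):
    """Split one operand into the 32-bit values moved in slave bus cycles.

    Byte and word operands occupy D0-D7 and D0-D15 respectively; an 8-byte
    (double-precision) operand is sent least-significant double-word first.
    """
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError("unsupported operand size")
    operand &= (1 << (8 * size_bytes)) - 1  # keep only the operand's own bits
    if size_bytes == 8:
        return [operand & 0xFFFFFFFF, operand >> 32]
    return [operand]


if __name__ == "__main__":
    for value in slave_transfers(0x1122334455667788, 8):
        print(hex(value))
```

When an instruction has two operands, the list for operand 1 would simply be transferred before the list for operand 2.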


TABLE 3-4. Interrupt Sequences

                                                      Data Bus
Cycle  Status  Address      DDIN  BE3  BE2  BE1  BE0  Byte 3  Byte 2  Byte 1  Byte 0

A. Non-Maskable Interrupt Control Sequences

Interrupt Acknowledge
1      00100   FFFFFF00₁₆   0     1    1    1    0    X       X       X       X

Interrupt Return
None: Performed through Return from Trap (RETT) instruction.

B. Non-Vectored Interrupt Control Sequences

Interrupt Acknowledge
1      00100   FFFFFE00₁₆   0     1    1    1    0    X       X       X       X

Interrupt Return
1      00110   FFFFFE00₁₆   0     1    1    1    0    X       X       X       X

C. Vectored Interrupt Sequences: Non-Cascaded

Interrupt Acknowledge
1      00100   FFFFFE00₁₆   0     1    1    1    0    X       X       X       Vector: Range: 0-127

Interrupt Return
1      00110   FFFFFE00₁₆   0     1    1    1    0    X       X       X       Vector: Same as in Previous Int. Ack. Cycle

D. Vectored Interrupt Sequences: Cascaded

Interrupt Acknowledge
1      00100   FFFFFE00₁₆   0     1    1    1    0    X       X       X       Cascade Index: range -16 to -1
(The CPU here uses the Cascade Index to find the Cascade Address)
2      00101   Cascade      0     See Note            Vector, range 16-255; on appropriate byte of data bus.
               Address

Interrupt Return
1      00110   FFFFFE00₁₆   0     1    1    1    0    X       X       X       Cascade Index: Same as in previous Int. Ack. Cycle
(The CPU here uses the Cascade Index to find the Cascade Address)
2      00111   Cascade      0     See Note
               Address

X = Don’t Care

Note: BE0-BE3 signals will be activated according to the cascaded ICU address
FIGURE 3-25. Slave Processor Write Cycle


3.5.5 Bus Exceptions 

The NS32GX32 has the capability of handling errors occur-
ring during the execution of a bus cycle. These errors can
be either correctable or uncorrectable, and the CPU can be
notified of their occurrence through the input signals BRT
and/or BER.

Bus Retry 

If a bus error can be corrected, the CPU may be requested 
to repeat the erroneous bus cycle. The request is done by
asserting the BRT signal. BRT is sampled at the end of 
state T2 or T2B. 

When the CPU detects that BRT is active, it completes the 
bus cycle normally, but ignores the data read in case of a 
read cycle, and maintains a copy of the data to be written in 
case of a write cycle. Then, after a delay of two clock cy- 
cles, it will start executing the bus cycle again. 

If the transfer cycle is multiple (e.g., for non-aligned data), 
only the problematic part will be repeated. 

For instance, if a non-aligned double-word is being trans- 
ferred and the second half of the transfer fails, only the 
second part will be repeated. 

The same applies for a retry during a burst sequence. The 
repeated cycle will begin where the read operation failed 
(rather than the first address of the burst) and will finish the 
original burst. 

Figures 3-27 and 4-10 (in Section 4) show the BRT timing 
for a basic access cycle and for burst cycles respectively. 
The CPU always waits for BRT to be HIGH before repeating
the bus cycle. While BRT is LOW, the CPU places all the 
output signals shown in Figure 4-11 in a TRI-STATE® condi- 
tion. 

Bus Error 

If a bus error is uncorrectable, the CPU may be requested to
interrupt the current process and branch to an appropriate
procedure to handle the error. The request is performed by
activating the BER signal. BER is sampled by the CPU at
the end of state T2 or T2B on the rising edge of BCLK.


FIGURE 3-26. Slave Processor Read Cycle


When BER is sampled active, the CPU completes the bus 
cycle normally. If a bus error occurs during a bus cycle for a 
reference required to execute an instruction, then a bus er- 
ror exception is recognized. However, if an error occurs dur- 
ing an acknowledge cycle of another exception or during 
the ICU read cycle of a RETI instruction, the CPU interprets 
the event as a fatal bus error and enters the ‘halted’ state. 
In this state the CPU floats its address and data buses and 
places a special status code on the ST0-4 lines. The CPU
can exit this condition only through a hardware reset. Refer 
to Section 3.2.6 for more details on bus error. 

Note 1: If the erroneous bus cycle is extended by means of wait states, then
the CPU uses the values of BRT and/or BER sampled during the 
last wait state. 

Note 2: If the CPU samples both BRT and BER active, BRT has higher 
priority. The bus error indication is ignored, and the bus cycle is 
repeated. 

Note 3: If BER is asserted during a bus cycle of a multi-cycle data transfer, 
the CPU completes the entire transfer normally, but the data will be 
ignored. The CPU also ignores any subsequent assertion of BER 
during the same data transfer. 

Note 4: Neither BRT nor BER should be asserted during the T2 state of a 
slave processor bus cycle. 
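
The sampling and priority rules in Notes 1 and 2 can be summarized in a few lines. The Python sketch below is an illustration under stated assumptions: the function name and the return strings are hypothetical, and the inputs are expressed as "asserted" booleans so that the electrical polarity of BRT and BER does not need to be modeled.

```python
def bus_exception_action(samples):
    """Decide how a bus cycle ends from the BRT/BER values sampled in it.

    samples: list of (brt_asserted, ber_asserted) tuples, one per T2 state
             (the first T2 and any wait states). Only the last one counts.
    """
    if not samples:
        raise ValueError("at least one T2 sample is required")
    brt_asserted, ber_asserted = samples[-1]
    if brt_asserted:
        return "retry"       # BRT has priority; a simultaneous BER is ignored
    if ber_asserted:
        return "bus_error"
    return "complete"


if __name__ == "__main__":
    print(bus_exception_action([(False, True), (True, True)]))
```

Whether a "bus_error" result leads to a bus error exception or to the halted state depends on the kind of cycle in progress, as described above, and is outside the scope of this sketch.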

3.5.6 Dynamic Bus Configuration 

The NS32GX32 is tuned to operate with 32-bit wide memory 
and peripheral devices. The bus also supports 8-bit and 
16-bit data widths, but at reduced efficiency. The CPU can
switch from one bus width to another dynamically; the only 
restriction is that the bus width cannot change for locations 
within an aligned 16-byte block. 

The CPU determines the bus width in effect for a bus cycle 
by using the values of the BW0 and BW1 signals sampled
during the last T2 state. Values of BW0 and BW1 sampled
before the last T2 state or during T2B states are ignored. 
Whenever a bus width other than 32-bit is detected by the 
CPU, two idle states are inserted before the next bus cycle 
is initiated. These idle states are only inserted once during 
an operand access, even if more than two bus cycles are 
needed to complete the access. 
FIGURE 3-27. Bus Retry During a Basic Read Cycle

The various combinations for BW0 and BW1 are shown be-
low.

BW1   BW0   Bus Width
0     0     Reserved
0     1     8-Bit Bus
1     0     16-Bit Bus
1     1     32-Bit Bus

The bus width is always 32 bits during slave cycles (See 
Section 3.5.4.7). An important feature of the NS32GX32 is 
that it does not impose any restrictions on the data align- 
ment, regardless of the bus width. 

Bus accesses are performed in double-word units. Access- 
es of data operands that cross double-word boundaries are 
decomposed into two or more aligned double-word access- 
es. 

The CPU provides four byte enable signals (BE0-3) which
facilitate individual byte accessing on either a 32-bit or a
16-bit bus.
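
The decomposition of an operand access into aligned double-word accesses, together with the byte lanes each access enables, can be sketched as follows. This Python model assumes a 32-bit bus and lists the accesses in ascending address order; both the function name and the ordering are assumptions made for illustration, and byte lane n corresponds to byte enable BEn.

```python
def double_word_accesses(address, size_bytes):
    """Break an operand access into aligned double-word accesses.

    Returns a list of (double_word_address, enabled_lanes) tuples, where
    enabled_lanes lists the byte lanes (0-3) that carry operand bytes.
    """
    accesses = []
    end = address + size_bytes      # first byte address past the operand
    dw = address & ~0x3             # aligned double-word holding the first byte
    while dw < end:
        first = max(address, dw)
        last = min(end, dw + 4)
        accesses.append((dw, [b - dw for b in range(first, last)]))
        dw += 4
    return accesses


if __name__ == "__main__":
    print(double_word_accesses(5, 2))  # a word operand at address 5
    print(double_word_accesses(6, 4))  # a double-word crossing a boundary
```

A word operand at address 5 fits in one access with lanes 1 and 2 enabled, whereas a double-word at address 6 crosses a double-word boundary and needs two accesses.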

Figures 3-28 and 3-29 show the basic interfaces for 32-bit
and 16-bit memories. An 8-bit memory interface (not shown)
is even simpler since it does not use any of the BE0-3
signals and its single bank is always enabled whenever the
memory is selected. Each byte location in this case is se-
lected by address bits A0-31.

The NS32GX32 does not keep track of the bus width used 
in previous instruction fetches or data accesses. At the be- 
ginning of every memory transaction, the CPU always as-
sumes that the bus is 32-bit wide and the BE0-3 signals are
activated accordingly. 

The BOUT signal is also asserted during instruction fetches 
or data reads if the conditions for bursting are satisfied. If 
the bus is other than 32-bit wide, the BIN signal is ignored 
and BOUT is deasserted at the beginning of the T state 
following T2, since burst cycles are not allowed for 8-bit or 
16-bit buses.


[Figure 3-28: basic interface for 32-bit memories, with byte enables BE3, BE2, BE1 and BE0]



Note: The CACH signal must be asserted during cacheable read accesses. 


The following subsections provide detailed descriptions of 
the access sequences performed in the various cases. 

Note: Although the NS32GX32 ignores the BIN signal for 8-bit and 16-bit 
bus widths, it is recommended that BIN be asserted only if the system 
supports burst transfers. This is to ensure compatibility with future 
versions of the CPU that might support burst transfers for 8-bit and 
16-bit buses. 



FIGURE 3-29. Basic Interface for 16-Bit Memories 

3.5.6.1 Instruction Fetch Sequences

The CPU performs two types of instruction fetch cycles: se- 
quential and non-sequential. These can be distinguished 
from each other by the differing status combinations on pins 
ST0-4. For non-sequential instruction fetches the CPU
presents on the address bus the exact byte address of the
first instruction in the instruction stream that is about to be-
gin; for sequential instruction fetches, the address of the
next aligned instruction double-word is presented on the ad-
dress bus. The CPU always activates all byte enable signals
(BE0-3) for both sequential and non-sequential fetches.
BOUT is also asserted during T2 if the addressed double-
word is not the last in an aligned 16-byte block. Tables 3-5
to 3-7 show the fetch sequence for the various bus widths.

32-Bit Bus Width

The CPU reads the entire double-word present on the data 
bus into its internal instruction buffer. 

If BOUT and BIN are both active, the CPU reads up to 3 
consecutive double-words using burst cycles. Burst cycles 
are used for instruction fetches regardless of whether the 
accesses are cacheable. 

Example: JUMP @5 

• The CPU performs a fetch cycle at address 5 with BE0-3
all active.

• Two burst cycles are then performed and addresses 8 and
12 are output while BE0-3 are kept active.

16-Bit Bus Width 

The word on the least-significant half of the data bus is read 
by the CPU. This is either the even or the odd word within 
the required instruction double-word, as determined by ad-
dress bit 1.

The CPU then complements address bit 1, clears address
bit 0 and initiates a bus cycle to read the other word, while
keeping all the BE0-3 signals active.

These two words are then assembled into a double-word 
and transferred into the instruction buffer. 

In case of a non-sequential fetch, if the access is not cache- 
able and the instruction address selects the odd word within 
the instruction double-word, the even word is not fetched. 


Example: JUMP @6

• A fetch cycle is performed at address 6 with BE0-3 all
active.

• The word at address 4 is then fetched if the access is 
cacheable. 

8-Bit Bus Width 

The instruction byte on the bus lines D0-7 is fetched. The
CPU performs three consecutive cycles to read the remain-
ing bytes within the required double-word, while keeping
BE0-3 all active. The 4 bytes are then assembled into a
double-word and transferred into the instruction buffer. For
a non-sequential fetch, if the access is not cacheable, the 
CPU will only read the upper bytes within the instruction 
double-word starting with the byte at the instruction ad- 
dress. 

Example: JUMP @7 

• The CPU performs a fetch cycle at address 7 with BE0-3
all active. 

• Bytes at addresses 4, 5 and 6 are then fetched consecu- 
tively if the access is cacheable. 
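
The three JUMP examples above follow one pattern that can be written down as a small function. The Python sketch below lists the addresses presented for a non-sequential instruction fetch; the function name is hypothetical, and for the 32-bit case it assumes that the memory grants the burst (BIN active) and that the burst is not cut short by a change in the flow of control.

```python
def nonsequential_fetch_addresses(addr, bus_width, cacheable):
    """Addresses of the bus cycles for a non-sequential instruction fetch."""
    dw = addr & ~0x3  # start of the double-word containing the instruction
    if bus_width == 32:
        # Opening cycle at the exact address, then burst cycles up to the
        # last double-word of the aligned 16-byte block.
        block_end = (addr & ~0xF) + 16
        return [addr] + list(range(dw + 4, block_end, 4))
    if bus_width == 16:
        # The other word: address bit 1 complemented, address bit 0 cleared.
        other_word = (addr ^ 0x2) & ~0x1
        starts_in_odd_word = bool(addr & 0x2)
        if starts_in_odd_word and not cacheable:
            return [addr]  # the even word is not fetched
        return [addr, other_word]
    if bus_width == 8:
        upper = list(range(addr, dw + 4))                # from addr to end of double-word
        lower = list(range(dw, addr)) if cacheable else []  # lower bytes only if cacheable
        return upper + lower
    raise ValueError("bus width must be 8, 16 or 32")


if __name__ == "__main__":
    print(nonsequential_fetch_addresses(5, 32, cacheable=True))
    print(nonsequential_fetch_addresses(6, 16, cacheable=True))
    print(nonsequential_fetch_addresses(7, 8, cacheable=True))
```

The three calls correspond to the JUMP @5, JUMP @6 and JUMP @7 examples for the 32-bit, 16-bit and 8-bit bus widths respectively.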


TABLE 3-5. Cacheable/Non-Cacheable Instruction Fetches from a 32-Bit Bus

1. In a burst access four bytes are fetched with the L.S. bits of the address set to 00.

2. A ‘C’ on the data bus refers to cacheable fetches and indicates that the byte is placed in the instruction cache. An ‘I’ refers
to non-cacheable fetches and indicates that the byte is ignored.

Number    Address  Bytes to be   Address  Data Bus
of Bytes  LSB      Fetched       Bus
1         11       B0 — — —               B0  C/I C/I C/I
2         10       B1 B0 — —              B1  B0  C/I C/I
3         01       B2 B1 B0 —             B2  B1  B0  C/I
4         00       B3 B2 B1 B0            B3  B2  B1  B0

TABLE 3-6. Cacheable/Non-Cacheable Instruction Fetches from a 16-Bit Bus

1. A bus access marked with ‘*’ in the ‘Address Bus’ column is performed only if the fetch is cacheable.

Number    Address  Bytes to be   Address  Data Bus
of Bytes  LSB      Fetched       Bus
1         11       B0 — — —               — — B0 C/I
                                          — — C  C
2         10       B1 B0 — —              — — B1 B0
                                          — — C  C
3         01       B2 B1 B0 —


TABLE 3-7. Cacheable/Non-Cacheable Instruction Fetches from an 8-Bit Bus 


Number 
of Bytes 

Address 

LSB 

Bytes to be Fetched 

Address 

Bus 

BEO-3 

Data Bus 

1 

11 

B0 — — — 



— — — B0 






— — — C 






— — — C 




Bn 


— — — C 

2 

10 

B1 B0 — — 



_ _ _ B0 






— — — B1 






— — — C 




• M 


— — — C 

3 

01 

B2 B1 B0 — 

A 


— — — B0 




A + 1 


— — — B1 




A + 2 


— — — B2 




•A - 1 


— — — C 

4 

00 

B3 B2 B1 B0 

mam 

■ 

_ _ _ B0 





B 

— — — B1 




mm 

B 

— — — B2 




mam 


— — — B3 


16-Bit Bus Width 

The word on the least-significant half of the data bus is read 
by the CPU. The CPU can then perform another access 
cycle with address bit 1 complemented and address bit 0 
cleared to read the other word within the addressed double- 
word. 

If the access is cacheable, the entire double-word is read 
and stored into the cache. 

If the access is not cacheable, the CPU ignores the bytes in
the double-word not selected by BE0-3. In this case, the
second access cycle is not performed, unless selected
bytes are contained in the second word.

Example: MOVB @5, R0

• The CPU reads a word at address 5 while keeping BE1
active.

• If the access is not cacheable, the CPU ignores byte 0.

• If the access is cacheable, the CPU performs another access
cycle, with BE0-3 all active, to read the word at
address 6.
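The 16-bit read rule above can be sketched as a small address calculation. This is only an illustration: the function name `read_cycles_16bit` is hypothetical, and the sketch assumes the operand lies entirely inside one aligned double-word.

```python
def read_cycles_16bit(addr, nbytes, cacheable):
    """Return the bus-cycle addresses for a data read on a 16-bit bus.

    Hypothetical helper; assumes the operand fits in one aligned double-word.
    """
    offset = addr & 3
    assert 1 <= nbytes <= 4 and offset + nbytes <= 4
    cycles = [addr]
    # Second cycle: address bit 1 complemented, address bit 0 cleared.
    other_word = (addr ^ 0b10) & ~0b01
    first_word = offset >> 1                 # 0 = lower word, 1 = upper word
    last_word = (offset + nbytes - 1) >> 1
    spans_both_words = first_word != last_word
    # Cacheable reads always fetch the whole double-word; non-cacheable reads
    # fetch the other word only when selected bytes live there.
    if cacheable or spans_both_words:
        cycles.append(other_word)
    return cycles
```

For MOVB @5, R0 this gives one cycle at address 5 when the access is not cacheable, and cycles at addresses 5 and 6 when it is.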

8-Bit Bus Width

The data byte on the bus lines D0-7 is read by the CPU.
The CPU can then perform up to 3 access cycles to read
the remaining bytes in the double-word.

If the access is cacheable, the entire double-word is read
and stored into the cache.

If the access is not cacheable, the CPU will only perform
those access cycles needed to read the selected bytes.

Example: MOVW @5, R0

• The CPU reads the byte at address 5 while keeping BE1
and BE2 active.

• If the access is not cacheable, the CPU activates BE2 and
reads the byte at address 6.

• If the access is cacheable, the CPU performs three bus
cycles with BE0-3 all active, to read the bytes at addresses
6, 7 and 4.
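The byte order in the example (5, then 6, 7 and 4) suggests that a cacheable read wraps around inside the double-word. A minimal sketch of that sequence follows; `read_cycles_8bit` is a hypothetical name, and the wrap-around order is inferred from the example rather than stated as a general rule.

```python
def read_cycles_8bit(addr, nbytes, cacheable):
    """Return the byte addresses read on an 8-bit bus, in bus-cycle order.

    Hypothetical helper; assumes the operand fits in one aligned double-word.
    """
    base = addr & ~3          # start of the aligned double-word
    offset = addr & 3
    assert 1 <= nbytes <= 4 and offset + nbytes <= 4
    # Cacheable: all four bytes, wrapping inside the double-word.
    # Non-cacheable: only the selected bytes.
    count = 4 if cacheable else nbytes
    return [base + ((offset + i) & 3) for i in range(count)]
```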


3.5.6.2 Data Read Sequences

The CPU starts a data read access by placing the exact
address of the operand on the address bus. The byte enable
lines are activated to select only the bytes required by
the instruction being executed. This prevents spurious accesses
to peripheral devices that might be sensitive to read
accesses, such as those which exhibit the characteristic of
destructive reading. If the on-chip data cache is internally
enabled for the read access, the BOUT signal is asserted at
the beginning of state T2. BOUT will be deasserted if the
data cache is externally inhibited (through CIIN or IODEC),
or the bus width is other than 32 bits. During cacheable
accesses the CPU always reads all the bytes in the double-word,
whether or not they are needed to execute the instruction,
and stores them into the data cache. The external
memory, in this case, must place the data on the bus regardless
of the state of the byte enable signals.

If the data cache is either internally or externally inhibited
during the access, the CPU ignores the bytes not selected
by the BE0-3 signals. Data read sequences for the various
bus widths are shown in Tables 3-8 to 3-10.

32-Bit Bus Width

The entire double-word present on the bus is read by the
CPU. If the access is cacheable and the memory allows
burst accesses, the CPU reads up to 3 additional double-words
within the aligned 16-byte block containing the first
byte of the operand. These burst accesses are performed in
a wrap-around fashion within the 16-byte block.

Example: MOVW @5, R0

• The CPU reads a double-word at address 5 while keeping
BE1 and BE2 active.

• If the access is not cacheable, BOUT is deasserted and
the data bytes 0 and 3 are ignored.

• If the access is cacheable, the CPU performs burst cycles
with BE0-3 all active, to read the double-words at addresses
8, 12, and 0.
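The wrap-around order of the burst cycles can be computed directly from the operand address. The following is a sketch under the assumption that all three additional double-words are read; `burst_addresses` is a hypothetical name.

```python
def burst_addresses(addr):
    """Return the double-word addresses read by the burst cycles that
    follow the first access, wrapping inside the aligned 16-byte block."""
    block = addr & ~0xF       # start of the aligned 16-byte block
    first_dw = addr & 0xC     # offset of the first double-word in the block
    return [block + ((first_dw + 4 * i) & 0xF) for i in range(1, 4)]
```

For the operand at address 5 the first access covers the double-word at address 4, and the burst continues with 8, 12 and 0.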


TABLE 3-8. Cacheable/Non-Cacheable Data Reads from a 32-Bit Bus

1. In a burst access, four bytes are read with the two least-significant bits of the address set to 00.

2. A 'C' on the data bus refers to cacheable reads and indicates that the byte is placed in the data cache. An 'I' refers to
non-cacheable reads and indicates that the byte is ignored.

Number     Address   Bytes to be      Data Bus
of Bytes   LSB       Read

1          00        —  —  —  B0      C/I C/I C/I B0
1          01        —  —  B0 —       C/I C/I B0  C/I
1          10        —  B0 —  —       C/I B0  C/I C/I
1          11        B0 —  —  —       B0  C/I C/I C/I
2          00        —  —  B1 B0      C/I C/I B1  B0
2          01        —  B1 B0 —       C/I B1  B0  C/I
2          10        B1 B0 —  —       B1  B0  C/I C/I
3          00        —  B2 B1 B0      C/I B2  B1  B0
3          01        B2 B1 B0 —       B2  B1  B0  C/I
4          00        B3 B2 B1 B0      B3  B2  B1  B0


TABLE 3-9. Cacheable/Non-Cacheable Data Reads from a 16-Bit Bus 

1. A bus access marked with '•' in the 'Address Bus' column is performed only if the read is cacheable.


BEO-3 


of Bytes 


1 


Address 

LSB 

Data to be Read 

00 

— 

— 

— 

BO 

01 

— 

— 

BO 

— 

10 

— 

BO 

— 

— 

11 

BO 

— 

— 

— 

00 

— 

— 

B1 

BO 

01 

— 

B1 

BO 

— 

10 

B1 

BO 

— 

— 

00 

— 

B2 

B1 

BO 

01 

B2 

B1 

BO 

— 

00 

B3 

B2 

B1 

BO 


Address 

Bus 



B1 BO 
B3 B2 


TABLE 3-10. Cacheable/Non-Cacheable Data Reads from an 8-Bit Bus


Number Address 

of Bytes LSB 


Data to be Read 



3.5.6.3 Data Write Sequences

In a write access the CPU outputs the operand address and
asserts only the byte enable lines needed to select the specific
bytes to be written.

In addition, the CPU duplicates the data to be written on the
appropriate bytes of the data bus in order to handle 8-bit
and 16-bit buses.

The various access sequences as well as the duplication of
data are summarized in Tables 3-11 to 3-13.


32-Bit Bus Width 

The CPU performs only one access cycle to write the se- 
lected bytes within the addressed double-word. 

Example: MOVB R0, @6 

• The CPU duplicates byte 2 of the data bus into byte 0 and 
performs a write cycle at address 6 with BE2 active. 
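The byte enables and the data duplication for a single 32-bit write cycle can be sketched as below. The duplication rule used here (the upper word mirrored onto the lower word, and the first byte always copied to byte 0) is inferred from the entries of Table 3-11, not quoted from the text. `write_cycle_32bit` is a hypothetical name, and the byte enables are listed BE3 first to mirror the table layout.

```python
def write_cycle_32bit(lsb, nbytes):
    """Return (byte_enables, lanes) for a write whose address ends in `lsb`.

    byte_enables is a string listed BE3..BE0 ('L' = active).
    lanes lists data-bus bytes 3..0, with None for an undefined byte.
    """
    assert 0 <= lsb <= 3 and 1 <= nbytes <= 4 - lsb
    lanes = [None] * 4
    for i in range(nbytes):
        lanes[lsb + i] = f"B{i}"          # operand bytes on their own lanes
    if lsb >= 2:                          # mirror upper word for 16-bit buses
        for i in range(nbytes):
            lanes[lsb - 2 + i] = f"B{i}"
    lanes[0] = "B0"                       # first byte on byte 0 for 8-bit buses
    enables = "".join(
        "L" if lsb <= lane < lsb + nbytes else "H" for lane in (3, 2, 1, 0)
    )
    return enables, lanes[::-1]
```

For MOVB R0, @6 this yields BE2 active and the byte on data-bus bytes 2 and 0, matching the example.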

16-Bit Bus Width 

Up to two access cycles are needed to complete the write
operation.

Example: MOVW R0, @5

• The CPU duplicates byte 1 of the data bus into byte 0 and 
performs a write cycle at address 5 with BE1 and BE2 
active. 

• A write at address 6 is then performed with BE2 active 
and the original byte 2 of the data bus placed on byte 0. 
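A sketch of the 16-bit write sequence follows. It is an illustration consistent with the example above and with Table 3-12, assuming the operand lies inside one aligned double-word. `write_cycles_16bit` is a hypothetical name, and the byte enables are listed BE3 first.

```python
def write_cycles_16bit(addr, nbytes):
    """Return [(address, byte_enables)] for a write on a 16-bit bus.

    byte_enables is a string listed BE3..BE0 ('L' = active).
    """
    offset = addr & 3
    assert 1 <= nbytes <= 4 - offset

    def enables(first, count):
        return "".join(
            "L" if first <= lane < first + count else "H" for lane in (3, 2, 1, 0)
        )

    cycles = [(addr, enables(offset, nbytes))]
    end = offset + nbytes
    # A second cycle is needed only when the operand starts in the lower
    # word and continues into the upper word of the double-word.
    if offset < 2 < end:
        upper_word_addr = (addr & ~3) + 2
        cycles.append((upper_word_addr, enables(2, end - 2)))
    return cycles
```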

8-Bit Bus Width 

Up to 4 access cycles are needed in this case to complete
the write operation.

Example: MOVB R0, @7

• The CPU duplicates byte 3 of the data bus into bytes 0
and 1, and then performs a write cycle at address 7 with
BE3 active.

3.5.7 Bus Access Control

The NS32GX32 has the capability of relinquishing its control
of the bus upon request from a DMA device or another CPU.
This capability is implemented with the HOLD and HLDA
signals. By asserting HOLD, an external device requests access
to the bus. On receipt of HLDA from the CPU, the
device may perform bus cycles, as the CPU at this point has
placed all the output signals shown in Figure 3-30 into the
TRI-STATE condition.

To return control of the bus to the CPU, the external device 
sets HOLD inactive, and the CPU acknowledges return of 
the bus by setting HLDA inactive. 

The CPU samples HOLD in the middle of each T-state on 
the falling edge of BCLK. If HOLD is asserted when the bus 
is idle between access sequences, then the bus is granted 
immediately (see Figure 3-29). If HOLD is asserted during 
an access sequence, then the bus is granted immediately 
after the access sequence, including any retried bus cycles, 
has completed (see Figure 4-13). Note that an access se- 
quence can be composed of several bus cycles if the bus 
width is 8 or 16 bits. 




TABLE 3-11. Data Writes to a 32-Bit Bus

1. Bytes on the data bus marked with '•' are undefined.

Number     Address   Data to be     Address   BE0-3   Data Bus
of Bytes   LSB       Written        Bus

1          00        —  —  —  B0    A         HHHL    •  •  •  B0
1          01        —  —  B0 —     A         HHLH    •  •  B0 B0
1          10        —  B0 —  —     A         HLHH    •  B0 •  B0
1          11        B0 —  —  —     A         LHHH    B0 •  B0 B0
2          00        —  —  B1 B0    A         HHLL    •  •  B1 B0
2          01        —  B1 B0 —     A         HLLH    •  B1 B0 B0
2          10        B1 B0 —  —     A         LLHH    B1 B0 B1 B0
3          00        —  B2 B1 B0    A         HLLL    •  B2 B1 B0
3          01        B2 B1 B0 —     A         LLLH    B2 B1 B0 B0
4          00        B3 B2 B1 B0    A         LLLL    B3 B2 B1 B0

TABLE 3-12. Data Writes to a 16-Bit Bus

Number     Address   Data to be     Address   BE0-3   Data Bus
of Bytes   LSB       Written        Bus

1          00        —  —  —  B0    A         HHHL    •  •  •  B0
1          01        —  —  B0 —     A         HHLH    •  •  B0 B0
1          10        —  B0 —  —     A         HLHH    •  B0 •  B0
1          11        B0 —  —  —     A         LHHH    B0 •  B0 B0
2          00        —  —  B1 B0    A         HHLL    •  •  B1 B0
2          01        —  B1 B0 —     A         HLLH    •  B1 B0 B0
                                    A + 1     HLHH    •  •  •  B1
2          10        B1 B0 —  —     A         LLHH    B1 B0 B1 B0
3          00        —  B2 B1 B0    A         HLLL    •  B2 B1 B0
                                    A + 2     HLHH    •  •  •  B2
3          01        B2 B1 B0 —     A         LLLH    B2 B1 B0 B0
                                    A + 1     LLHH    •  •  B2 B1
4          00        B3 B2 B1 B0    A         LLLL    B3 B2 B1 B0
                                    A + 2     LLHH    •  •  B3 B2

TABLE 3-13. Data Writes to an 8-Bit Bus


Number 
of Bytes 


1 


1 


1 


1 


2 



Address 

LSB 

Data to be Written 

00 

— 

— 

— 

BO 

01 

— 

— 

B0 

— 

10 

— 

B0 

— 

— 

11 

B0 

— 

— 

— 

00 

— 

— 

B1 

BO 

01 

— 

B1 

BO 

— 

10 

B1 

B0 

— 

— 

00 


B2 

B1 

BO 

01 

B2 

B1 

BO 

— 







Address 

Bus 



B2 

B1 

BO 

• 

• 

B1 

• 

• 

B2 






































































TL/EE/1 0253-37 

FIGURE 3-30. Hold Acknowledge. (Bus Initially Idle.) 

Note: The status indicates ‘IDLE’ while the bus is granted. If the cause of the IDLE changes (e.g., CPU starts waiting for an interrupt), the status also changes. 

The CPU will never grant the bus between interlocked read 
and write bus cycles. 

Note: If an external device requires a very short latency to get control of the 
bus, the bus retry signal (BRT) can be used instead of hold. See 
Section 3.5.5. 

3.5.8 Interfacing Memory-Mapped I/O Devices 

In Section 3.1.3.2 it was mentioned that some special precautions
are needed when interfacing I/O devices to the
NS32GX32 due to its internal pipelined implementation.

Two special signals are provided for this purpose: IOINH
and IODEC. The CPU asserts IOINH during a read bus cycle
to indicate that the bus cycle should be ignored if an I/O
device is selected. The system responds by asserting
IODEC to indicate to the CPU that an I/O device has been
selected. IODEC is sampled by the CPU in the middle of
state T2. If the cycle is extended, then the CPU uses the
IODEC value sampled during the last wait state. If a bus
error or a bus retry occurs, the sampled IODEC value is
ignored. IODEC must be kept high during burst transfer cycles.

When IODEC is active during a bus cycle for which IOINH is 
asserted, the CPU discards the data and applies the special 
handling required for I/O devices. Figure 3-31 shows a pos- 
sible implementation of an I/O device interface where the 
address mapping of the I/O devices is fixed. 

In an open system configuration, IODEC could be generated 
by the decoding logic of each I/O device subsystem. 

Note 1: When IODEC is active in response to a read bus cycle, the CPU 
treats the reference as noncacheable. 

Note 2: IOINH is kept inactive during write cycles. 
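The sampling and discard rules for IODEC and IOINH can be summarized in a few lines. This is only a behavioural sketch: both function names are hypothetical, signals are modelled as booleans meaning "active" rather than as electrical levels, and the "special handling required for I/O devices" is not modelled.

```python
def effective_iodec(samples, bus_error=False, bus_retry=False):
    """Return the IODEC value the CPU acts on, or None if it is ignored.

    samples: IODEC values (True = active) taken in the middle of T2 and,
    if the cycle is extended, in each following wait state.
    """
    if bus_error or bus_retry:
        return None           # the sampled value is ignored
    return samples[-1]        # value from T2 or from the last wait state


def read_data_discarded(ioinh_asserted, iodec_active):
    """True when the CPU discards the data of a read bus cycle."""
    return bool(ioinh_asserted and iodec_active)
```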



TL/EE/1 0253-38 

FIGURE 3-31. Typical I/O Device Interface 


3.5.9 Interrupt and Debug Trap Requests 

Three signals are provided by the CPU to externally request
interrupts and/or a debug trap. INT and NMI are for maskable
and non-maskable interrupts respectively. DBG is used
for requesting an external debug trap.

The CPU samples INT and NMI on every other rising edge
of BCLK, starting with the second rising edge of BCLK after
RST goes high.

NMI is edge-sensitive; a high-to-low transition on it is detect- 
ed by the CPU and stored in an internal latch, so that there 
is no need to keep it asserted until it is acknowledged. 

INT is level-sensitive and, as such, once asserted, it must 
be kept asserted until it is acknowledged. 

The DBG signal, like NMI, is edge-sensitive; it differs from 
NMI in that the CPU samples it on each rising edge of 
BCLK. DBG can be asserted asynchronously to the CPU 
clock, but it should be at least 1.5 clock cycles wide in order 
to be recognized. 

If DBG meets the specified setup and hold times, it will be 
recognized on the rising edge of BCLK deterministically. 
Refer to Figures 4-19 and 4-20 for more details on the tim- 
ing of the above signals. 

Note: If the NMI signal is pulsed to request a non-maskable interrupt, it may
be necessary to keep it asserted for a minimum of two clock cycles to
guarantee its detection, unless extra logic ensures that the pulse occurs
around the BCLK sampling edge.
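The two-clock-cycle minimum follows from NMI being sampled on every other rising edge of BCLK. The sketch below is an idealized model that ignores setup and hold times: time is measured in BCLK periods, and the function name, `phase` and `period` parameters are hypothetical.

```python
import math


def nmi_pulse_sampled(start, width, phase=0.0, period=2.0):
    """True if a sampling instant (phase + k * period) falls inside the
    pulse interval [start, start + width). Times are in BCLK periods."""
    k = math.ceil((start - phase) / period)   # first sample at or after start
    return phase + k * period < start + width
```

In this model a pulse of one BCLK period starting just after a sampling edge is missed, while a pulse of two BCLK periods always covers one sampling edge regardless of where it starts.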


3.5.10 Internal Status 

The NS32GX32 provides information on the system inter- 
face concerning its internal activity. 

The U/S signal will indicate the state of the U bit in the PSR 
except in the following cases: 

While executing a MOVUS instruction it will be ‘1’ during the 
source read. 

While executing a MOVSU instruction it will be ‘1’ during the 
destination write. 

The PFS signal is asserted for one BCLK cycle when the
CPU begins executing a new instruction. The ISF signal is
driven High along with PFS if the new instruction does not
follow the previous instruction in sequence. More specifically,
ISF is High along with PFS after processing an exception
or after executing one of the following instructions: ACB
(branch taken), Bcond (branch taken), BR, BSR, CASE,
CXP, CXPD, DIA, JSR, JUMP, RET, RETT, RETI, and RXP.

The BP signal is asserted for one BCLK cycle when an
address-compare or PC-match condition is detected. If the BP
signal is asserted one BCLK cycle after PFS, it indicates
that an address-compare debug condition has been detected.
If BP is asserted at any other time, it indicates that a
PC-match debug condition has been detected.
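An external monitor could tell the two debug conditions apart from the relative timing of PFS and BP. The sketch below assumes the monitor records the BCLK cycle numbers at which each signal was seen asserted; `classify_bp` and its arguments are hypothetical.

```python
def classify_bp(pfs_cycles, bp_cycle):
    """Classify a BP pulse from the BCLK cycle in which it was observed.

    pfs_cycles: set of BCLK cycle numbers in which PFS was asserted.
    bp_cycle:   BCLK cycle number in which BP was asserted.
    """
    # BP exactly one BCLK cycle after PFS marks an address-compare condition.
    if bp_cycle - 1 in pfs_cycles:
        return "address-compare"
    return "PC-match"
```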

While executing a CINV instruction, the CPU displays the 
operation code and source operand using slave processor 
write bus cycles. 

During idle bus cycles, the signals ST0-ST4 indicate wheth- 
er the CPU is waiting for an interrupt, waiting for a Slave 
Processor to complete executing an instruction or halted. 
4.1 NS32GX32 PIN DESCRIPTIONS

Descriptions of the NS32GX32 pins are given in the follow- 
ing sections. 

Included are also references to portions of the functional 
description, Section 3. 

Figure 4-1 shows the NS32GX32 interface signals grouped 
according to related functions. 

Note: An asterisk next to the signal name indicates a TRI-STATE condition
for that signal when HOLD is acknowledged or during an extended
retry.

4.1.1 Supplies

VCCL1-6 Logic Power.

+5V positive supplies for on-chip logic.

VCCB1-14 Buffers Power.

+5V positive supplies for on-chip output buffers.

VCCCLK Bus Clock Power.

+5V positive supply for on-chip clock drivers.

GNDL1-6 Logic Ground.

Ground references for on-chip logic.

GNDB1-13 Buffers Ground.

Ground references for on-chip output buffers.

GNDCLK Bus Clock Ground.

Ground reference for on-chip clock drivers.


4.1.2 Input Signals 
CLK Clock. 

Input Clock used to derive all CPU Timing. 
SYNC Synchronize. 

When SYNC is active, BCLK will stop tog- 
gling. This signal can be used to synchronize 
two or more CPUs (Section 3.5.2). 

HOLD Hold Request. 

When active, causes the CPU to release the 
bus for DMA or multiprocessing purposes 
(Section 3.5.7). 

Note: 

If the HOLD signal is generated asynchronously, its set 
up and hold times may be violated. In this case it is rec- 
ommended to synchronize it with the falling edge of 
BCLK to minimize the possibility of metastable states. 
The CPU provides only one synchronization stage to min- 
imize the HLDA latency. This is to avoid speed degrada- 
tions in cases of heavy HOLD activity (i.e. DMA controller 
cycles interleaved with CPU cycles). 

RST Reset. 

When RST is active, the CPU is initialized to 
a known state (Section 3.5.3). 

INT Interrupt. 

A low level on this signal requests a maskable
interrupt (Section 3.5.9).

NMI Nonmaskable Interrupt. 

A High-to-Low transition of this signal re- 
quests a nonmaskable interrupt (Section 
3.5.9). 
4.0 Device Specifications (continued)

DBG Debug Trap Request.

A High-to-Low transition of this signal requests a debug trap (Section 3.5.9).

CIIN Cache Inhibit In.

When active, indicates that the location referenced in the current bus cycle is not cacheable.
CIIN must not change within an aligned 16-byte block.

IODEC I/O Decode.

Indicates to the CPU that a peripheral device is addressed by the current bus cycle. The
value of IODEC must not change within an aligned 16-byte block (Section 3.5.8).

FSSR Force Slave Status Read.

When asserted, indicates that the slave status word should be read by the CPU
(Section 3.1.4.1). An external 10 kΩ resistor should be connected between FSSR and VCC.

SDN Slave Done.

Used by a slave processor to signal the completion of a slave instruction
(Section 3.1.4.1). An external 10 kΩ resistor should be connected between SDN and VCC.

BIN Burst In.

When active, indicates to the CPU that the memory supports burst cycles (Section 3.5.4.3).

RDY Ready.

While this signal is not active, the CPU extends the current bus cycle to support a slow
memory or peripheral device.

BW0-1 Bus Width.

These lines define the bus width (8, 16 or 32 bits) for each data transfer; BW0 is the least
significant bit. The bus width must not change within an aligned 16-byte block. Encodings are:

00 — Reserved
01 — 8 Bits
10 — 16 Bits
11 — 32 Bits

BRT Bus Retry.

When active, the CPU will reexecute the last bus cycle (Section 3.5.5).

BER Bus Error.

When active, indicates that an error occurred during a bus cycle. It is treated by the CPU as
the highest priority exception after reset.

4.1.3 Output Signals

BCLK Bus Clock.

Output clock for bus timing (Section 3.5.2).

BCLK Bus Clock Inverse.

Inverted output clock.

HLDA Hold Acknowledge.

Activated by the CPU in response to the HOLD input to indicate that the CPU has
released the bus.

PFS Program Flow Status.

A pulse on this signal indicates the beginning of execution for each instruction
(Section 3.5.10).

ISF Internal Sequential Fetch.

Indicates along with PFS that the instruction beginning execution is sequential (ISF Low)
or non-sequential (ISF High).

U/S User/Supervisor.

User or supervisor mode status (Section 3.5.10).

BP Break Point.

This signal is activated when the CPU detects a PC or operand-address match debug
condition (Section 3.3.2).

CASEC *Cache Section.

For cacheable data read bus cycles indicates the Section of the on-chip Data Cache where
the data will be placed; undefined for other bus cycles.

IOINH I/O Inhibit.

Indicates that the current bus cycle should be ignored if a peripheral device is addressed.

SPC Slave Processor Control.

Data strobe for slave processor transfers.

BOUT *Burst Out.

When active, indicates that the CPU is requesting to perform burst cycles.

ILO Interlocked Operation.

When active, indicates that interlocked cycles are being performed (Section 3.5.4.5).

DDIN *Data Direction.

Indicates the direction of a data transfer. It is low for reads and high for writes.

CONF *Confirm Bus Cycle.

When active, indicates that a bus cycle initiated by ADS is valid; that is, the bus cycle has
not been cancelled (Section 3.5.4.2).
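The BW0-1 encodings (01 for 8 bits, 10 for 16 bits, 11 for 32 bits, 00 reserved) can be decoded as in the following sketch. The function name `bus_width` is hypothetical, and the arguments are the logical bit values of BW1 and BW0.

```python
BUS_WIDTH_BITS = {0b01: 8, 0b10: 16, 0b11: 32}


def bus_width(bw1, bw0):
    """Return the bus width in bits for the given BW1/BW0 bit values."""
    code = (bw1 << 1) | bw0       # BW0 is the least significant bit
    if code not in BUS_WIDTH_BITS:
        raise ValueError("BW0-1 = 00 is a reserved encoding")
    return BUS_WIDTH_BITS[code]
```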

BMT *Begin Memory Transaction.

When Stable Low indicates that the current bus cycle is valid; that is, the bus cycle has
not been cancelled (Section 3.5.4.2).

ADS *Address Strobe.

When active, indicates that a bus cycle has begun and a valid address is on the address
bus.

BE0-3 *Byte Enables.

Used to selectively enable data transfers on bytes 0-3 of the data bus.

ST0-4 Status.

Bus cycle status code; ST0 is the least significant. Encodings are:

00000 — Idle: CPU Inactive on Bus.
00001 — Idle: WAIT Instruction.
00010 — Idle: Halted.
00011 — Idle: The bus is idle while the slave processor is executing an instruction.
00100 — Interrupt Acknowledge, Master.
00101 — Interrupt Acknowledge, Cascaded.
00110 — End of Interrupt, Master.
00111 — End of Interrupt, Cascaded.
01000 — Sequential Instruction Fetch.
01001 — Non-Sequential Instruction Fetch.
01010 — Data Transfer.
01011 — Read Read-Modify-Write Operand.
01100 — Read for Effective Address.
01101 through 11100 — Reserved.
11101 — Transfer Slave Operand.
11110 — Read Slave Status Word.
11111 — Broadcast Slave ID.

A0-31 *Address Bus.

Used by the CPU to output a 32-bit address at the beginning of a bus cycle. A0 is the
least significant.

4.1.4 Input/Output Signals

D0-31 *Data Bus.

Used by the CPU to input or output data during a read or write cycle respectively.
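A logic-analyzer script or simulation model might turn the ST0-4 status code into a readable name with a lookup like the one below. The table contents come from the encodings listed for ST0-4. The function name `decode_status` and the integer packing (ST0 as bit 0, ST4 as bit 4) are assumptions for illustration.

```python
ST_CODES = {
    0b00000: "Idle: CPU inactive on bus",
    0b00001: "Idle: WAIT instruction",
    0b00010: "Idle: halted",
    0b00011: "Idle: slave processor executing an instruction",
    0b00100: "Interrupt acknowledge, master",
    0b00101: "Interrupt acknowledge, cascaded",
    0b00110: "End of interrupt, master",
    0b00111: "End of interrupt, cascaded",
    0b01000: "Sequential instruction fetch",
    0b01001: "Non-sequential instruction fetch",
    0b01010: "Data transfer",
    0b01011: "Read read-modify-write operand",
    0b01100: "Read for effective address",
    0b11101: "Transfer slave operand",
    0b11110: "Read slave status word",
    0b11111: "Broadcast slave ID",
}


def decode_status(code):
    """Return the name of a 5-bit ST0-4 status code (ST0 = bit 0)."""
    if not 0 <= code <= 0b11111:
        raise ValueError("status code must fit in 5 bits")
    # Codes 01101 through 11100 are reserved.
    return ST_CODES.get(code, "Reserved")
```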

4.2 ABSOLUTE MAXIMUM RATINGS

If Military/Aerospace specified devices are required,
please contact the National Semiconductor Sales
Office/Distributors for availability and specifications.

Case Temperature Under Bias                        0°C to +95°C
Storage Temperature                                -65°C to +150°C
All Input or Output Voltages with Respect to GND   -0.5V to +7V
Power Dissipation                                  4 W

Note: Absolute maximum ratings indicate limits beyond
which permanent damage may occur. Continuous operation
at these limits is not intended; operation should be limited to
those conditions specified under Electrical Characteristics.

4.3 ELECTRICAL CHARACTERISTICS

NS32GX32-20, 25: TCASE = 0° to +95°C, VCC = 5V ±10%, GND = 0V
NS32GX32-30: TCASE = 0° to +95°C, VCC = 5V ±5%, GND = 0V


Symbol 

Parameter 

V|H 

High Level Input Voltage 

V| L 

Low Level Input Voltage 

VOH 

High Level Output Voltage 

VOL 

Low Level Output Voltage 

A0-11, DO-31, DDIN 

CONF, BMT 

BCLK, BCLK 

All Other Outputs 

II 

Input Load Current 

II 

Leakage Current (Output and 

I/O pins in TRI-STATE/Input Mode) 

C|N 

CLK Input Capacitance 

!CC 

Active Supply Current 



Iqh = -400 fi, A 


!ol = 4 mA 
Iol = 6 mA 
Iol = 16 mA 
Iol = 2 mA 


0 £ V|n ^ Vqc 


0.4 ^ V|n ^ Vcc 


•OUT = 0. Ta = 25°C 
V CC = 5V 




700 @ 30 MHz 
600 @ 25 MHz 
470 @ 20 MHz 


800 @ 30 MHz 
700 @ 25 MHz 
575 @ 20 MHz 
Connection Diagram

FIGURE 4-2. 175-Pin PGA Package

NS32GX32 Pinout Descriptions


Desc 


Desc 

Reserved 

raj 

D26 

Reserved 

£S 

Reserved 

Reserved 

in 

Reserved 

BP 

A4 

VCCL2 

1SF 

A5 

Reserved 

RST 

A6 

PFS 

NMI 

A7 

SDN 

GNDB1 

A8 

Reserved 

Reserved 

A9 

BCLK 

VCCB2 

A10 

VCCCLK 

Reserved (2) 

All 

SynS 

Reserved (1) 

A12 

Reserved (2) 

Reserved (2) 

A13 

Reserved (2) 

Reserved (2) 

A14 

VCCL6 

VCCB1 

A15 

D29 

Reserved 

B1 

D27 

VCCB4 

B2 

D25 

Reserved 

B3 

U/S 

Reserved 

B4 

Reserved 

VCCB3 

B5 

Reserved 

F55R 

B6 

GNDL3 

INT 

B7 

GNDB2 

VCCL1 

B8 

DBS 

GNDL2 

B9 

Reserved 

Reserved (2) 

BIO 

BCLK 

Reserved (2) 

B11 

GNDCLK 

Reserved (2) 

B12 

CLK 

Reserved (2) 

B13 

Reserved (2) 

D30 

B14 

D31 

D28 

B15 

GNDL1 


Desc 


TL/EE/10253-40 


B16 GNDB13 D14 
Cl VCCB14 D15 
C2 D23 D16 

C3 IOINH 

C4 its 

C5 GNDB3 
C6 
C7 
C8 
C9 , 

CIO CASEC F2 
Cl 1 Reserved F3 
Cl 2 D21 FI 4 
Cl 3 D19 F15 

C14 D18 F16 

Cl 5 A29 G1 

C16 A31 G2 

D1 VCCB5 G3 
D2 GNDB12 G14 
D3 D17 G15 

D4 D16 G16 

D5 A27 HI 

D6 A28 H2 

D7 GNDB4 H3 
D8 VCCB13 HI 4 
D9 D15 H15 

DIO D14 H16 

Dll A26 J1 

D12 A25 J2 

D13 A24 J3 | 


GNDL6 

VCCL5 

D13 

VCCB6 

A23 

GNDL4 

GNDB11 

Dll 

D12 


VCCL3 

D8 

D9 

DIO 

A20 

GNDB5 

A17 

D5 

D7 

VCCB12 

A19 

A18 

A14 

All 

VCCB8 

GNDB7 

ST4 

HLDA 


GNDL5 

CGNF 


RDY 

FR5ED 

VCCB11 

GNDB10 

D4 

D6 

A16 

VCCB7 

GNDB6 

A10 

A6 

A2 

STS 

GNDB8 

VCCL4 

BEl 

GNDB9 

BWO 

BIN 

Reserved 

DO 

D3 

A15 

A12 

A9 

A7 

A4 


Pin Desc 

N9 AO 

N10 VCCB9 

Nil Rese rved 

N12 SPC 

N13 BE3 RIO 

N14 VCCB10 R11 

N15 ADS R12 

N16 BW1 R13 

PI BER R14 

P2 CUN R15 

P3 D2 R16 

P4 A13 SI 

P5 A8 S2 

P6 A5 S3 

P7 A3 S4 

P8 A1 S5 

P9 ST2 S6 

P10 ST1 S7 

P11 STO S8 

PI 2 SCOT S9 

Pi 3 B0IFJ S10 

P14 BE2 S11 

PI 5 BE6 S12 

PI 6 BitfT S13 

R1 BRT S14 

R2 IODEC SI 5 

R3 D1 SI 6 


Note 1: This pin should be grounded. 

Note 2: This pin should be connected to logical high. 
All other reserved pins should be left open. 

4.4 SWITCHING CHARACTERISTICS 
4.4.1 Definitions 

All the timing specifications given in this section refer to 
0.8V or 2.0V on all the signals as illustrated in Figures 4-3 
and 4-4, unless specifically stated otherwise. 




FIGURE 4-3. Output Signals Specification Standard


ABBREVIATIONS: 

L.E. — leading edge     R.E. — rising edge
T.E. — trailing edge    F.E. — falling edge




FIGURE 4-4. Input Signals Specification Standard

4.4.2 Timing Tables 

4.4.2.1 Output Signals: Internal Propagation Delays, NS32GX32-20, NS32GX32-25, NS32GX32-30

• Maximum times assume capacitive loading of 100 pF on the clock signals and 50 pF on all the other signals. A minimum
capacitance load of 50 pF on BCLK and BCLK is also assumed.

• The output to input timings (e.g., Address to RDY, Address to BER, etc.) are at least 2 ns better than the worst case values
calculated from the output valid and input setup times relative to BCLK or BCLK.


Description Reference/Conditions 


Bus Clock Period R.E., BCLK to Next 
R.E., BCLK 


Name 

Figure 

l BC p 

4-24 

t BC h 

4-24 

tBC| 

4-24 

*BC r 
(Note 1) 

4-24 

*BCf 
(Note 1) 

4-24 

tNBCh 

4-24 

tNBC| 

4-24 

tNBC r 
(Note 1 ) 

4-24 

tNBCf 
(Note 1) 

4-24 

fCBCdr 

4-24 

tCBCd, 

4-24 

tCNBCdr 

4-24 

tCNBCdf 

4-24 

tBCNBCrf 

(Notel) 

4-24 

tBCNBCf r 
(Note 1) 

4-24 

w 

ksb 

<Ah 

KBB 

Uf 

4-11,4-12 

^Anf 

4-11,4-12 


NS32GX32-20 

NS32GX32-25 

NS32GX32-30 

Units 

Min 

Max 

Min 

Max 

Min 

Max 


50 

100 

40 

100 

33.3 

100 

ns 


At 2.0V on BCLK 
(Both Edges) 


At 0.8V on BCLK 
(Both Edges) 


0.8V to 2.0V on 
R.E., BCLK 


2.0V to 0.8V on 
F.E., BCLK 


At 2.0V on BCLK 
(Both Edges) 


At 0.8V on BCLR 
(Both Edges) 


0.8V to 2.0V on 


R.E., BCLK 


2.0V to 0.8V on 


F.E., BCLK 



BCLK High Time 


BCLK Low Time 


BCLK Rise Time 


BCLK Fall Time 


BCLK High Time 


BCLK Low Time 


BCLK Rise Time 


BCLK Fall Time 


CLK to BCLK 
R.E. Delay 


CLK to BCLK 
F.E. Delay 


CLK to BCLK 
R.E. Delay 


CLK to BCLK 
F.E. Delay 


Not Floating 


Note t: Guaranteed by characterization. Due to tester conditions, this parameter is not 100% tested. 


Name | Figure | Description | Reference/Conditions | NS32GX32-20 (Min / Max) | NS32GX32-25 (Min / Max) | NS32GX32-30 (Min / Max) | Units
tABv | 4-8 | Address Bits A2, A3 Valid (Burst Cycle) | After R.E., BCLK T2B | — / 11 | — / 9 | — / 8 | ns
tABh | 4-8 | Address Bits A2, A3 Hold (Burst Cycle) | After R.E., BCLK T2B | 0 / — | 0 / — | 0 / — | ns
tDOv | 4-6, 4-15 | Data Out Valid | After R.E., BCLK T1 | — / 0.5 tBCp + 13 | — / 0.5 tBCp + 12 | — / 0.5 tBCp + 11 | ns
tDOh | 4-6, 4-15 | Data Out Hold | After R.E., BCLK T1 or Ti | 0 / — | 0 / — | 0 / — | ns
tDOspc | 4-15 | Data Out Setup (Slave Write) | Before SPC T.E. | 8 / — | 6 / — | 5 / — | ns
tDOf | | Data Bus Floating | After R.E., BCLK T1 or Ti | — / 21 | — / 17 | — / 13 | ns
tDOnf | | Data Bus Not Floating | After F.E., BCLK T1 | 0 / — | 0 / — | 0 / — | ns
 | | BMT Signal Valid | After R.E., BCLK T1 | — / 32 | — / 27 | — / 23 | ns
tBMTh | | BMT Signal Hold | After R.E., BCLK T2 | 0 / — | 0 / — | 0 / — | ns
tBMTf | 4-11, 4-12 | BMT Signal Floating | After F.E., BCLK Ti | — / 21 | — / 17 | — / 13 | ns
tBMTnf | 4-11, 4-12 | BMT Signal Not Floating | After F.E., BCLK Ti | 0 / — | 0 / — | 0 / — | ns
tCONFa | 4-5, 4-8 | CONF Signal Active | After R.E., BCLK T1 | 0.5 tBCp / 0.5 tBCp + 11 | 0.5 tBCp / 0.5 tBCp + 9 | 0.5 tBCp / 0.5 tBCp + 8 | ns
 | 4-5, 4-8 | CONF Signal Inactive | After R.E., BCLK T1 or Ti | — / 11 | — / 9 | — / 8 | ns
tCONFf | 4-11, 4-12 | CONF Signal Floating | After F.E., BCLK Ti | — / 21 | — / 17 | — / 13 | ns
tCONFnf | 4-11, 4-12 | CONF Signal Not Floating | After F.E., BCLK Ti | 0 / — | 0 / — | 0 / — | ns
tADSa | 4-5, 4-8 | ADS Signal Active | After R.E., BCLK T1 | — / 11 | — / 9 | — / 8 | ns
tADSia | 4-5, 4-8 | ADS Signal Inactive | After F.E., BCLK T1 | — / 11 | — / 9 | — / 8 | ns
tADSw | 4-6 | ADS Pulse Width | At 0.8V (Both Edges) | 15 / — | 12 / — | 9 / — | ns
tADSf | 4-11, 4-12 | ADS Signal Floating | After F.E., BCLK Ti | — / 21 | — / 17 | — / 13 | ns
tADSnf | 4-11, 4-12 | ADS Signal Not Floating | After F.E., BCLK Ti | 0 / — | 0 / — | 0 / — | ns
tBEv | 4-6, 4-8 | BEn Signals Valid | After R.E., BCLK T1 | — / 11 | — / 9 | — / 8 | ns
tBEh | 4-6, 4-8 | BEn Signals Hold | After R.E., BCLK T1, Ti or T2B | 0 / — | 0 / — | 0 / — | ns
tBEf | 4-11, 4-12 | BEn Signals Floating | After F.E., BCLK Ti | — / 21 | — / 17 | — / 13 | ns
tBEnf | 4-11, 4-12 | BEn Signals Not Floating | After F.E., BCLK Ti | 0 / — | 0 / — | 0 / — | ns
 | | DDIN Signal Valid | After R.E., BCLK T1 | — / 11 | — / 9 | — / 8 | ns
tDDINh | | DDIN Signal Hold | After R.E., BCLK T1 or Ti | 0 / — | 0 / — | 0 / — | ns
tDDINf | 4-11, 4-12 | DDIN Signal Floating | After F.E., BCLK Ti | — / 21 | — / 17 | — / 13 | ns
tDDINnf | 4-11, 4-12 | DDIN Signal Not Floating | After F.E., BCLK Ti | 0 / — | 0 / — | 0 / — | ns
 | 4-14, 4-15 | SPC Signal Active | After R.E., BCLK T1 | — / 19 | — / 15 | — / 12 | ns
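Some limits in this table are expressed relative to the bus clock period, for example the Data Out Valid maximum of 0.5 tBCp + 13 ns for the NS32GX32-20. The sketch below evaluates that expression at each speed grade's minimum bus clock period (50, 40 and 33.3 ns). The dictionary layout and the function name `data_out_valid_max` are hypothetical.

```python
# (minimum bus clock period in ns, Data Out Valid offset in ns) per speed grade
GRADES = {
    "NS32GX32-20": (50.0, 13.0),
    "NS32GX32-25": (40.0, 12.0),
    "NS32GX32-30": (33.3, 11.0),
}


def data_out_valid_max(grade, t_bcp=None):
    """Return the maximum Data Out Valid delay in ns: 0.5 * tBCp + offset.

    t_bcp defaults to the minimum bus clock period of the speed grade.
    """
    min_period, offset = GRADES[grade]
    period = min_period if t_bcp is None else t_bcp
    if period < min_period:
        raise ValueError("bus clock period below the specified minimum")
    return 0.5 * period + offset
```

At the minimum bus clock period this evaluates to 38 ns for the NS32GX32-20, 32 ns for the NS32GX32-25 and 27.65 ns for the NS32GX32-30.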



Description 


Reference/Conditions 


NS32GX32-20 NS32GX32-25 NS32GX32-30 


Min Max Min Max Min Max 


4-14,4-15 SPC Signal Inactive After R.E., BCLK Ti, T 1 orT2 


tDDSPC 4-14 
(Note 1) 


DDIN Valid to 
SPC Active 


Before SPC L.E. 


4-12,4-13 HLDA Signal Active After F.E., BCLK Ti 


4-12 HLDA Signal Inactive After F.E., BCLKTi 


4-5, 4-14 Status (STO-4) Valid After R.E., BCLKTI 


4-5,4-14 Status (STO-4) Hold After R.E., BCLKTI orTi 


4-8, 4-9 BOUT Signal Active After R.E., BCLK T2 


4-8, 4-9 BOUT Signal Inactive After R.E., BCLK 

LastT2B, TI orTi 


teouTf 1 4-11, 4-1 2 1 BOUT Signal Floating | After F.E., BCLKTi 



tBOUT n f 4-11,4-12 BOUT Signal 
Not Floating 


After F.E., BCLKTi 



Interlock Signal Active After F.E., BCLK Ti 


Interlock Signal Inactive After F.E., BCLK Ti 


PFS Signal Active 


PFS Signal Inactive 


4-22 ISF Signal Active 


4-22 ISF Signal Inactive 


BP Signal Active 


BP Signal Inactive 


U/S Signal Valid 


U/S Signal Hold 


After F.E., BCLK 


After F.E., Next BCLK 


After F.E., BCLK 


After F.E., Next BCLK 


After F.E., BCLK 


After F.E., Next BCLK 


After R.E., BCLKTI 


After R.E., BCLK TI or Ti 


CASEC Signal Valid | After F.E., BCLK TI 


CASEC Signal Hold I After R.E., BCLK TI or Ti 


tcASf 4-11, 4-12 CASEC Signal Floating After F.E., BCLKTi 


tCAS nf 4-11,4-12 CASEC Signal After F.E., BCLKTi 

Not Floating 



IOINH Signal Valid 


IOINH Signal Hold 


After R.E., BCLK TI 


After R.E., BCLKTI orTi 



Note 1: Guaranteed by characterization. Due to tester conditions, this parameter is not 100% tested. 
4.4.2.2 Input Signal Requirements: NS32GX32-20, NS32GX32-25, NS32GX32-30

Symbol        | Description                 | Reference/Conditions
              | Input Clock Period          | R.E., CLK to Next R.E., CLK
              | CLK High Time               | At 2.0V on CLK (Both Edges)
              | CLK Low Time                | At 0.8V on CLK (Both Edges)
tCr (Note 1)  | CLK Rise Time               | 0.8V to 2.0V on R.E., CLK
tCf (Note 1)  | CLK Fall Time               | 2.0V to 0.8V on F.E., CLK
              | Data In Setup               | Before R.E., BCLK T1 or Ti
              | Data In Hold                | After R.E., BCLK T1 or Ti
              | RDY Setup Time              | Before R.E., BCLK T2(W), T1 or Ti
              | RDY Hold Time               | After R.E., BCLK T2(W), T1 or Ti
              | BW0-1 Setup Time            | Before F.E., BCLK T2 or T2(W)
              | BW0-1 Hold Time             | After F.E., BCLK T2 or T2(W)
              | HOLD Setup Time             | Before F.E., BCLK
              | HOLD Hold Time              | After F.E., BCLK
              | BIN Setup Time              | Before F.E., BCLK T2 or T2(W)
              | BIN Hold Time               | After F.E., BCLK T2 or T2(W)
              | BER Setup Time              | Before R.E., BCLK T1 or Ti
              | BER Hold Time               | After R.E., BCLK T1 or Ti
              | BRT Setup Time              | Before R.E., BCLK T1 or Ti
              | BRT Hold Time               | After R.E., BCLK T1 or Ti
              | IODEC Setup Time            | Before F.E., BCLK T2 or T2(W)
              | IODEC Hold Time             | After F.E., BCLK T2 or T2(W)
tPWR (Note 1) | Power Stable to R.E. of RST | After VCC Reaches 4.5V
              | RST Setup Time              | Before R.E., BCLK
              | RST Pulse Width             | At 0.8V (Both Edges)
4.4.3 Timing Diagrams

Note: An idle state is always inserted before a Write Cycle when the
Write immediately follows a confirmed Read Cycle. A0-31, DDIN,
BE0-3, ST0-4 remain unchanged during this idle state.

FIGURE 4-6. Write Cycle Timing
FIGURE 4-10. Bus Error or Retry During Burst Cycles

Note: Two idle states are always inserted by the CPU following the assertion of BRT.
FIGURE 4-11. Extended Retry Timing

FIGURE 4-14. Slave Processor Read Timing

FIGURE 4-15. Slave Processor Write Timing


[Timing diagram: BCLK and SDN waveforms, with the intervals tSDs and tSDh marked]

FIGURE 4-16. Slave Processor Done


[Timing diagram: BCLK and FSSR waveforms]

FIGURE 4-17. FSSR Signal Timing
FIGURE 4-18. INT and NMI Signals Sampling

Note 1: INT and NMI are sampled on every other rising edge of BCLK, starting with the second rising edge of BCLK after RST goes high.

Note 2: INT is level sensitive, and once asserted, it should not be deasserted until it is acknowledged.

FIGURE 4-22. Break Point Signal Timing
[Timing diagram: BCLK and RST waveforms, with tPWR marked]

FIGURE 4-25. Power-On Reset


[Timing diagram: BCLK and RST waveforms]

FIGURE 4-26. Non-Power-On Reset
Appendix A: Instruction Formats

NOTATIONS:

i = Integer Type Field
        B = 00 (Byte)
        W = 01 (Word)
        D = 11 (Double Word)
f = Floating Point Type Field
        F = 1 (Std. Floating: 32 bits)
        L = 0 (Long Floating: 64 bits)
c = Custom Type Field
        D = 1 (Double Word)
        Q = 0 (Quad Word)
op = Operation Code
        Valid encodings shown with each format.
gen, gen 1, gen 2 = General Addressing Mode Field
        See Section 2.2 for encodings.
reg = General Purpose Register Number
cond = Condition Code Field
        0000 = EQual: Z = 1
        0001 = Not Equal: Z = 0
        0010 = Carry Set: C = 1
        0011 = Carry Clear: C = 0
        0100 = Higher: L = 1
        0101 = Lower or Same: L = 0
        0110 = Greater Than: N = 1
        0111 = Less or Equal: N = 0
        1000 = Flag Set: F = 1
        1001 = Flag Clear: F = 0
        1010 = LOwer: L = 0 and Z = 0
        1011 = Higher or Same: L = 1 or Z = 1
        1100 = Less Than: N = 0 and Z = 0
        1101 = Greater or Equal: N = 1 or Z = 1
        1110 = (Unconditionally True)
        1111 = (Unconditionally False)
short = Short Immediate value. May contain:
        quick: Signed 4-bit value, in MOVQ, ADDQ, CMPQ, ACB.
        cond: Condition Code (above), in Scond.
        areg: CPU Dedicated Register, in LPR, SPR.
                0000 = US
                0001 = DCR
                0010 = BPC
                0011 = DSR
                0100 = CAR
                0101-0111 = (Reserved)
                1000 = FP
                1001 = SP
                1010 = SB
                1011 = USP
                1100 = CFG
                1101 = PSR
                1110 = INTBASE
                1111 = MOD

Options: in String Instructions

        T = Translated
        B = Backward
        U/W = 00: None
              01: While Match
              11: Until Match

Configuration bits, in SETCFG Instruction:

        | C | Res | F | I |

Note: Reserved bit must be set to 0 when executing SETCFG.


  7               0
| cond | 1 0 1 0 |

Format 0

Bcond    (BR)

  7               0
| op   | 0 0 1 0 |

Format 1

BSR      -0000        ENTER    -1000
RET      -0001        EXIT     -1001
CXP      -0010        NOP      -1010
RXP      -0011        WAIT     -1011
RETT     -0100        DIA      -1100
RETI     -0101        FLAG     -1101
SAVE     -0110        SVC      -1110
RESTORE  -0111        BPT      -1111

 15             8 7              0
| gen | short | op | 1 1 | i |

Format 2

ADDQ     -000         ACB      -100
CMPQ     -001         MOVQ     -101
SPR      -010         LPR      -110
Scond    -011








 15             8 7              0
| gen | op | 1 1 1 1 1 | i |

Format 3

CXPD     -0000        ADJSP    -1010
BICPSR   -0010        JSR      -1100
JUMP     -0100        CASE     -1110
BISPSR   -0110

Trap (UND) on XXX1, 1000





 15             8 7              0
| gen 1 | gen 2 | op | i |

Format 4

ADD      -0000        SUB      -1000
CMP      -0001        ADDR     -1001
BIC      -0010        AND      -1010
ADDC     -0100        SUBC     -1100
MOV      -0101        TBIT     -1101
OR       -0110        XOR      -1110

 23            16 15             8 7              0
| 0 0 0 0 0 | short | 0 | op | i | 0 0 0 0 1 1 1 0 |

Format 5

MOVS     -0000        SETCFG   -0010
CMPS     -0001        SKPS     -0011

Trap (UND) on 1XXX, 01XX




 23            16 15             8 7              0
| gen 1 | gen 2 | reg | op | i | op | 1 0 1 1 1 0 |

Format 8

EXT      -0 00        INDEX    -1 00
CVTP     -0 01        FFS      -1 01
INS      -0 10
CHECK    -0 11
MOVSU    -110, reg = 001
MOVUS    -110, reg = 011
 23            16 15             8 7              0
| gen 1 | gen 2 | op | f | i | 0 0 1 1 1 1 1 0 |

Format 9

MOVif    -000         ROUND    -100
LFSR     -001         TRUNC    -101
MOVLF    -010         SFSR     -110
MOVFL    -011         FLOOR    -111

  7               0
| 0 1 1 1 1 1 1 0 |

Format 10

Trap (UND) Always
 23            16 15             8 7              0
| gen 1 | gen 2 | op | 0 | f | 1 1 1 1 1 1 1 0 |

Format 12

Note 2   -0000        Note 2   -1000
Note 1   -0001        Note 1   -1001
POLYf    -0010        Note 3   -1010
DOTf     -0011        Note 1   -1011
SCALBf   -0100        Note 2   -1100
LOGBf    -0101        Note 1   -1101
Note 2   -0110        Note 2   -1110
Note 1   -0111        Note 1   -1111

  7               0
| 1 0 0 1 1 1 1 0 |

Format 13

Trap (UND) Always


 23            16 15             8 7              0
| gen 1 | short | 0 | op | i | 0 0 0 1 1 1 1 0 |

Format 14

CINV     -1001

Trap (UND) on 00XX, 01XX, 1000, 101X, 11XX


 23                            8 7               0
| Operation Word               | n n n 1 0 1 1 0 |
                                 ID Byte

Format 15
(Custom Slave)

Operation Word Format

 23            16 15             8
| gen 1 | short | x | op | i |

Format 15.0 (nnn = 000)

Trap (UND) on all others

 23            16 15             8
| gen 1 | gen 2 | op | c | i |

Format 15.1 (nnn = 001)

CCV3     -000         CCV2     -100
LCSR     -001         CCV1     -101
CCV5     -010         SCSR     -110
CCV4     -011         CCV0     -111

 23            16 15             8
| gen 1 | gen 2 | op | x | c |

Format 15.5 (nnn = 101)

CCAL0    -0000        CCAL3    -1000
CMOV0    -0001        CMOV3    -1001
CCMP0    -0010        Note 3   -1010
CCMP1    -0011        Note 1   -1011
CCAL1    -0100        CCAL2    -1100
CMOV2    -0101        CMOV1    -1101
Note 2   -0110        Note 2   -1110
Note 1   -0111        Note 1   -1111

 23            16 15             8
| gen 1 | gen 2 | op | x | c |

Format 15.7 (nnn = 111)

Note 2   -0000        Note 2   -1000
Note 1   -0001        Note 1   -1001
Note 3   -0010        Note 3   -1010
Note 3   -0011        Note 1   -1011
Note 2   -0100        Note 2   -1100
Note 1   -0101        Note 1   -1101
Note 2   -0110        Note 2   -1110
Note 1   -0111        Note 1   -1111

If nnn = 010, 011, 100, 110 then Trap (UND) Always.

  7               0
| 0 1 0 1 1 1 1 0 |

Format 16

Trap (UND) Always

  7               0
| 1 1 0 1 1 1 1 0 |

Format 17

Trap (UND) Always

  7               0
| 1 0 0 0 1 1 1 0 |

Format 18

Trap (UND) Always

  7               0
| x x x 0 0 1 1 0 |

Format 19

Trap (UND) Always

Implied Immediate Encodings:

  7                                     0
| r7 | r6 | r5 | r4 | r3 | r2 | r1 | r0 |

Register Mask, Appended to SAVE, ENTER

  7                                     0
| r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 |

Register Mask, Appended to RESTORE, EXIT

  7                     0
| offset | length - 1 |

Offset/Length Modifier Appended to INSS, EXTS


Note 1: Opcode not defined; CPU treats like MOVf or CMOVc. First operand
has access class of read; second operand has access class of write; f or c
field selects 32- or 64-bit data.

Note 2: Opcode not defined; CPU treats like ADDf or CCALc. First operand
has access class of read; second operand has access class of read-modify-write;
f or c field selects 32- or 64-bit data.

Note 3: Opcode not defined; CPU treats like CMPf or CCMPc. First operand
has access class of read; second operand has access class of read; f or c
field selects 32- or 64-bit data.

Appendix B. Compatibility Issues 

The NS32GX32 is compatible with the Series 32000 archi- 
tecture implemented by the NS32532, NS32032, NS32332, 
and previous microprocessors in the family. Compatibility 
means that within certain limited constraints, programs that 
execute on one of the earlier Series 32000 microprocessors 
will produce identical results when executed on the 
NS32GX32. Compatibility applies to privileged operating 
systems programs, as well as to non-privileged applications 
programs. This appendix explains both the restrictions on 
compatibility with previous Series 32000 microprocessors 
and the extensions to the architecture that are implemented 
by the NS32GX32. 

B.1 RESTRICTIONS ON COMPATIBILITY 

If the following restrictions are observed, then a program 
that executes on an earlier Series 32000 microprocessor 
will produce identical results when executed on the 
NS32GX32 in an appropriately configured system: 

1. The program is not time-dependent. For example, the
program should not use instruction loops to control real- 
time delays. 

2. The program does not use any encodings of instructions,
operands, addresses, or control fields identified to
be reserved or undefined. For example, if the count
operand’s value for an LSHi instruction is not within the
range specified by the Series 32000 Instruction Set
Reference Manual, then the results produced by the
NS32GX32 may differ from those of the NS32032.

3. The program does not depend on the use of a Memory 
Management Unit (MMU). 

4. The program does not depend on the detection of bus 
errors according to the implementation of the NS32332. 
For example, the NS32GX32 distinguishes between re- 
startable and nonrestartable bus errors by transferring 
control to the appropriate bus-error exception service 
procedure through one of two distinct entries in the In- 
terrupt Dispatch Table. In contrast, the NS32332 uses a 
single entry in the Interrupt Dispatch Table for all bus 
errors. 

5. The program does not modify itself. Refer to Section B.4 
for more information. 

6. The program does not depend on the execution of certain
complex instructions to be non-interruptible. Refer
to Section B.5, “Memory-Mapped I/O”, for more
information.

7. The program does not use the custom slave instructions
CATST0 and CATST1, as they are not supported by the
NS32GX32 and will result in a Trap (UND) when their
execution is attempted.

B.2 ARCHITECTURE EXTENSIONS 

The NS32GX32 implements the following extensions of the 
Series 32000 architecture using previously reserved control 
bits, instruction encodings, and memory locations. Exten- 
sions implemented earlier in the NS32332, such as 32-bit 
addressing, are not listed. 

1. The DC, LDC, IC, and LIC bits in the CFG register have
been defined to control the on-chip Instruction and Data
Caches. The DE-bit in the CFG register has been defined
to enable Direct-Exception Mode.

2. The V-flag in the PSR register has been defined to en- 
able the Integer-Overflow Trap. 

3. The DCR, BPC, DSR, and CAR registers have been de- 
fined to control debugging features. Access to these 
registers has been added to the definition of the LPR 
and SPR instructions. 

4. Access to the CFG and SP1 registers has been added 
to the definition of the LPR and SPR instructions. 

5. The CINV instruction has been defined to invalidate
the contents of the on-chip Instruction and Data Caches.

6. Direct-Exception Mode has been added to support fast- 
er interrupt service time and systems without module 
tables. 

7. A new entry has been added to the Interrupt Dispatch 
Table for supporting vectors to distinguish between re- 
startable and nonrestartable bus errors. Two additional 
entries support Trap (OVF) and Trap (DBG). 

B.3 INTEGER OVERFLOW TRAP 

A new trap condition is recognized for integer arithmetic 
overflow. Trap (OVF) is enabled by the V-flag in the PSR. 
This new trap is important because detection of integer 
overflow conditions is required for certain programming
languages, such as Ada, and the PSR flags do not indicate the
occurrence of overflow for ASHi, DIVi and MULi instructions.
More details on integer overflow are given in Section 3.2.5,
which also describes all the cases in which an overflow
condition is detected.

INTEGER ARITHMETIC 

The V-flag in the PSR enables Trap (OVF) to occur following 
execution of an integer arithmetic instruction whose result 
cannot be represented exactly in the destination operand’s 
location. 

If the number of bits required to represent the resulting quo- 
tient of a DEI instruction exceeds half the number of bits of 
the destination, then the contents of both the quotient and 
remainder stored in the destination are undefined. 

The ADDR instruction can be used in place of integer arith- 
metic instructions to perform certain calculations. In this 
case however, integer overflow is not detected by the CPU. 

LOGICAL INSTRUCTIONS 

The V-flag in the PSR enables Trap (OVF) to occur following 
execution of an ASHi instruction whose result cannot be 
represented exactly in the destination operand’s location. 

ARRAY INSTRUCTIONS 

The V-flag in the PSR enables Trap (OVF) to occur following 
execution of a CHECKi instruction whose source operand is 
out of bounds. 

PROCESSOR CONTROL INSTRUCTIONS 

The V-flag in the PSR enables Trap (OVF) to occur following 
execution of an ACBi instruction if the sum of the “inc” val- 
ue and the “index” operand cannot be represented exactly 
in the "index” operand’s location. 
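
The common condition in the cases above is that a result "cannot be represented exactly" in the destination. As a rough illustration for the ACBi case, the sketch below checks whether the sum of "inc" and "index" fits a signed two's complement operand of a given width. It models only the representability test, assumes signed operands, and does not model the CPU or the V-flag; the function names are illustrative.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed two's complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def acb_would_overflow(inc: int, index: int, bits: int) -> bool:
    """True if inc + index cannot be stored back into an index operand of `bits` bits.

    Assumption: `inc` is the signed 4-bit quick value and `index` is a signed operand.
    """
    if not fits_signed(inc, 4):
        raise ValueError("inc must fit the signed 4-bit quick field")
    return not fits_signed(inc + index, bits)


if __name__ == "__main__":
    print(acb_would_overflow(1, 127, 8))    # byte index already at its maximum
    print(acb_would_overflow(1, 126, 8))    # result still representable
```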

B.4 SELF-MODIFYING CODE 

The Series 32000 architecture does not have special provi- 
sions to optimally support self-modifying programs. 
Nevertheless, on the NS32332 and previous Series 32000 
microprocessors it is possible to execute self-modifying 
code according to the following sequence: 

1. Modify the appropriate instruction. 

2. Execute a JUMP instruction or other instruction that 
causes the microprocessor’s instruction queue to be 
flushed. 

3. Execute the modified instruction. 

For example, an interactive debugger may follow the se- 
quence above after reaching a breakpoint in a program be- 
ing monitored. 

The same program may not produce identical results when 
executed on the NS32GX32 due to effects of the Instruction 
Cache and branch prediction. In order to execute self-modi- 
fying code on the NS32GX32 it is necessary to do the fol- 
lowing: 

1. Modify the appropriate instruction.

2. If the modified instruction is on a cacheable page, exe- 
cute CINV to invalidate the contents of the Instruction 
Cache. 

3. Execute an instruction that causes a serializing
operation. See Section 3.1.3.3.

4. Execute the modified instruction. 


B.5 MEMORY-MAPPED I/O 

As was mentioned in Section 3.1.3.2, certain peripheral
devices exhibit characteristics identified as “destructive-reading”
and “side-effects of writing” that impose requirements
for special handling of memory-mapped I/O references.
The NS32GX32 supports two methods to use on references
to memory-mapped peripheral devices that exhibit either or
both of these characteristics.

For peripheral devices that exhibit only side-effects of writing,
correct operation can be ensured either by locating the
device between addresses FF000000 (hex) and FF7FFFFF
(hex) in the address space or by observing the first 2
restrictions listed below. For peripheral devices that exhibit
destructive-reading, all the following restrictions must be
observed to ensure correct operation:

1. References to the device must be inhibited while the
CPU asserts the output signal IOINH.

2. The input signal IODEC must be asserted by the system
on references to the device.

3. The device cannot be used for instruction fetches or reads
of effective addresses.

4. If an instruction that reads a source operand from the
device crosses a page boundary, then no Trap (ABT) or
restartable bus error can occur during fetches from the
page with higher addresses.

5. The device can be used as a source operand only for
instructions in the list below.


ABSi     CBITi    MOVMi    SBITIi
ADDi     CBITIi   MOVXi    SUBi
ADDCi    CMPi     MOVZi    SUBCi
ADDPi    CMPQi    NEGi     SUBPi
ADDQi    COMi     NOTi     TBITi
ANDi     IBITi    ORi      XORi
ASHi     LSHi     ROTi
BICi     MOVi     SBITi



This restriction arises because the CPU can respond to
interrupt requests during the execution of complex
instructions in order to reduce interrupt latency. Thus, the
CPU may read the source operands for a DEID instruc- 
tion (extended-precision divide), begin calculating the in- 
struction’s results, and then respond to an interrupt re- 
quest before completing the instruction. In such an 
event, the instruction can be executed again and com- 
pleted correctly after the interrupt service procedure re- 
turns unless one of the source operands was altered by 
destructive-reading. 
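
The address range and the instruction list of Section B.5 lend themselves to a simple design-time check. The sketch below is illustrative only: it tests whether a device address lies in the FF000000 to FF7FFFFF (hex) region and whether a mnemonic appears in the list for restriction 5. The names are hypothetical, and the check does not cover restrictions 1 through 4, which concern system hardware.

```python
IO_REGION_START = 0xFF000000
IO_REGION_END = 0xFF7FFFFF

# Instructions that may take a destructive-read device as a source operand
# (generic "i" forms, copied from the list in Section B.5).
ALLOWED_SOURCE_INSTRUCTIONS = frozenset({
    "ABSi", "ADDi", "ADDCi", "ADDPi", "ADDQi", "ANDi", "ASHi", "BICi",
    "CBITi", "CBITIi", "CMPi", "CMPQi", "COMi", "IBITi", "LSHi", "MOVi",
    "MOVMi", "MOVXi", "MOVZi", "NEGi", "NOTi", "ORi", "ROTi", "SBITi",
    "SBITIi", "SUBi", "SUBCi", "SUBPi", "TBITi", "XORi",
})


def in_dedicated_io_region(addr: int) -> bool:
    """True if addr lies between FF000000 and FF7FFFFF (hex), inclusive."""
    return IO_REGION_START <= addr <= IO_REGION_END


def may_read_destructive_device(mnemonic: str) -> bool:
    """True if the generic mnemonic is in the list for restriction 5."""
    return mnemonic in ALLOWED_SOURCE_INSTRUCTIONS


if __name__ == "__main__":
    print(in_dedicated_io_region(0xFF000010))
    print(may_read_destructive_device("DEIi"))
```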

Appendix C. Instruction Set 
Extensions 

The following sections describe the differences and ex- 
tensions to the Series 32000 instruction set (as present- 
ed in the "Series 32000 Instruction Set Reference Man- 
ual”) implemented by the NS32GX32. 

No changes or additions have been made to the user- 
mode instruction set, and only a few privileged instruc- 
tions have been added. 
C.1 PROCESSOR SERVICE INSTRUCTIONS

The CFG register, User Stack Pointer (SP1), and Debug 
Registers can be loaded and stored using privileged forms 
of the LPRi and SPRi instructions. 

When the SETCFG instruction is executed, the CFG register 
bits 0 through 3 are loaded from the instruction’s short field, 
bits 4 through 7 are forced to 1, and bits 8 through 12 are 
forced to 0. 
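
The SETCFG behaviour described above can be written as a small bit-manipulation sketch. It is illustrative only: the function name is hypothetical, and bits above bit 12 are simply modelled as 0 because the text does not describe them.

```python
def cfg_after_setcfg(short: int) -> int:
    """Model the CFG value after SETCFG with the given 4-bit short field.

    Bits 0-3 come from the short field, bits 4-7 are forced to 1,
    and bits 8-12 are forced to 0. Higher bits are assumed to be 0.
    """
    if not 0 <= short <= 0xF:
        raise ValueError("short field is 4 bits wide")
    forced_ones = 0xF << 4        # bits 4 through 7
    return forced_ones | short    # bits 8 through 12 stay clear


if __name__ == "__main__":
    print(hex(cfg_after_setcfg(0b0011)))
```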

The contents of the on-chip Instruction Cache and Data 
Cache can be invalidated by executing the privileged in- 
struction CINV. While executing the CINV instruction, the 
CPU generates 2 slave bus cycles on the system interface 
to display the first 3 bytes of the instruction and the source 
operand. 

C.2 INSTRUCTION DEFINITIONS 

This section provides a description of the operations and 
encodings of the new NS32GX32 privileged instructions. 

Load and Store Processor Registers

Syntax: LPRi  procreg, src
              short    gen
                       read.i

        SPRi  procreg, dest
              short    gen
                       write.i

The LPRi and SPRi instructions can be used to load and 
store the User Stack Pointer (USP or SP1), the Configura- 
tion Register (CFG) and the Debug Registers in addition to 
the Processor Registers supported by the previous Series 
32000 CPUs. Access to these registers is privileged. 

Figure C-1 and Table C-1 show the instruction formats and 
the new ‘short’ field encodings for LPRi and SPRi. 

Flags Affected: No flags affected by loading or storing the
                USP, CFG, or Debug Registers.

Traps:          Illegal Instruction Trap (ILL) occurs if an
                attempt is made to load or store the USP,
                CFG or Debug Registers while the U-flag
                is 1.


 15             8 7              0
| gen | short   | 1 1 0 1 1 | i |
  src   procreg               LPRi

 15             8 7              0
| gen | short   | 0 1 0 1 1 | i |
  dest  procreg               SPRi

FIGURE C-1. LPRi/SPRi Instruction Formats

TABLE C-1. LPRi/SPRi New ‘Short’ Field Encodings

Register                      procreg    short field
Debug Condition Register      DCR        0001
Breakpoint Program Counter    BPC        0010
Debug Status Register         DSR        0011
Compare Address Register      CAR        0100
User Stack Pointer            USP        1011
Configuration Register        CFG        1100


Cache Invalidate

Syntax: CINV  options, src
                       gen
                       read.D

The CINV instruction invalidates the contents of locations in 
the on-chip Instruction Cache and Data Cache. The instruc- 
tion can be used to invalidate either the entire contents of 
the on-chip caches or only a 16-byte block. In the latter 
case, the 28 most-significant bits of the source operand 
specify the physical address of the aligned 16-byte block;
the 4 least-significant bits of the source operand are ig- 
nored. If the specified block is not located in the on-chip 
caches, then the instruction has no effect. If the entire 
cache contents is to be invalidated, then the source oper- 
and is read, but its value is ignored. 
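
As a small illustration of the block-addressing rule above, the sketch below derives the aligned 16-byte block address from a 32-bit source operand by keeping the 28 most-significant bits. The function names are illustrative only.

```python
def cinv_block_base(src: int) -> int:
    """Return the aligned 16-byte block address selected by a 32-bit source operand."""
    return src & 0xFFFFFFF0       # keep the 28 most-significant bits


def same_cinv_block(addr_a: int, addr_b: int) -> bool:
    """True if both addresses fall in the same 16-byte block."""
    return cinv_block_base(addr_a) == cinv_block_base(addr_b)


if __name__ == "__main__":
    print(hex(cinv_block_base(0x1234ABCD)))
```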

Options are specified by listing the letters A (invalidate All), I 
(Instruction Cache), and D (Data Cache). If neither the I nor 
D option is specified, the instruction has no effect. 

In the instruction encoding, the options are represented in 
the A, I, and D fields as follows: 

A: 0 — invalidate only a 16-byte block 
1 — invalidate the entire cache 
I: 0 — do not affect the Instruction Cache 
1— invalidate the Instruction Cache 
D: 0 — do not affect the Data Cache 
1 — invalidate the Data Cache 
Flags Affected: None 

Traps: Illegal Operation Trap (ILL) occurs if an attempt
       is made to execute this instruction while the
       U-flag is 1.

Examples: 

1. CINV A, D, I, R3    1E A7 1B

2. CINV I, R3          1E 27 19

Example 1 invalidates the entire Instruction Cache and Data 
Cache. 

Example 2 invalidates the 16-byte block in the Instruction
Cache whose physical address is contained in R3.
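
The byte sequences in the two examples can be reproduced with a short encoder. The sketch below is an illustration, not a reference assembler: the bit positions of the A, I and D option bits (17, 16 and 15) are inferred from the example bytes and from Figure C-2, the gen value 3 is taken to select register R3, and the function name is hypothetical.

```python
def encode_cinv(gen: int, a: bool, i: bool, d: bool) -> bytes:
    """Assemble the 3-byte CINV encoding, returned in memory (low byte first) order."""
    word = (gen & 0x1F) << 19     # general addressing mode of the source operand
    word |= int(a) << 17          # A: invalidate the entire cache
    word |= int(i) << 16          # I: Instruction Cache
    word |= int(d) << 15          # D: Data Cache
    word |= 0b1001 << 10          # CINV operation code (Format 14)
    word |= 0b11 << 8             # integer type field, D (double word)
    word |= 0b00011110            # Format 14 identifying byte
    return word.to_bytes(3, "little")


if __name__ == "__main__":
    print(encode_cinv(3, a=True, i=True, d=True).hex(" ").upper())    # Example 1
    print(encode_cinv(3, a=False, i=True, d=False).hex(" ").upper())  # Example 2
```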
 23            16 15             8 7              0
| gen | 0 | A | I | D | 0 | 1 0 0 1 | 1 1 | 0 0 0 1 1 1 1 0 |
  src   options                              CINV

FIGURE C-2. CINV Instruction Format


Appendix D. Instruction 
Execution Times 

The NS32GX32 achieves its performance by using an ad- 
vanced implementation incorporating a 4-stage Instruction 
Pipeline, an Instruction Cache and a Data Cache into a sin- 
gle integrated circuit. 

As a consequence of this advanced implementation, the 
performance evaluation for the NS32GX32 is more complex 
than for the previous microprocessors in the Series 32000 
family. In fact, it is no longer possible to determine the exe- 
cution time for an instruction using only a set of tables for 
operations and addressing modes. Rather, it is necessary to 
consider dependencies between the various instructions ex- 
ecuting in the pipeline, as well as the occurrence of misses 
for the on-chip caches. 

The following sections explain the method to evaluate the 
performance of the NS32GX32 by calculating various timing 
parameters for an instruction sequence. Due to the high 
degree of parallelism in the NS32GX32, the evaluation tech- 
niques presented here include some simplifications and ap- 
proximations. 

D.1 INTERNAL ORGANIZATION 
AND INSTRUCTION EXECUTION 

The NS32GX32 is organized internally as 8 functional units 
as shown in Figure 1. The functional units operate in parallel 
to execute instructions in the 4-stage pipeline. The structure 
of this pipeline is shown in Figure 3-2. The Instruction Fetch 
and Instruction Decode pipeline stages are implemented in 
the loader along with the 8-byte instruction queue and the 
buffer for a decoded instruction. The Address Calculation 
pipeline stage is implemented in the address unit. The Exe- 
cute pipeline stage is implemented in the Execution Unit 
along with the write data buffer that holds up to two results 
directed to memory. 

The Address Unit and Execution Unit can process instructions
at a peak rate of 2 clock cycles per instruction,
enabling a sustained pipeline throughput at 30 MHz of
15 MIPS (million instructions per second) for sequences of
register-to-register, immediate-to-register, memory-to-register,
and register-to-memory instructions. Nevertheless, the
rate of instruction execution in the pipeline falls below the
peak of one instruction every 2 cycles because of the
following causes of delay:

1. Complex operations, like division, require more than 2 cy- 
cles in the Execution Unit, and complex addressing 
modes, like memory relative, require more than 2 cycles 
in the Address Unit. 

2. Dependencies between instructions can limit the flow 
through the pipeline. A data dependency can arise when 
the result of one instruction is the source of a following 
instruction. Control dependencies arise when branching 
instructions are executed. Section D.3 describes the 
types of instruction dependencies that impact perform- 
ance and explains how to calculate the pipeline delays. 


3. Cache misses can cause the flow of instructions through 
the pipeline to be delayed, as can non-aligned refer- 
ences. Section D.4 explains the performance impact for 
these forms of storage delays. 

The effective time Teff needed to execute an instruction is
given by the following formula:

Teff = Te + Td + Ts

Te is the execution time in the pipeline in the absence of
data dependencies between instructions and storage delays,
Td is the delay due to data dependencies, and Ts is the
effect of storage delays.
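
The formula and the peak-rate figure quoted in Section D.1 can be combined in a short calculation. The sketch below is illustrative: the function names are hypothetical, times are in clock cycles, and the sample cycle counts in the usage lines are made up for demonstration rather than taken from the timing tables.

```python
def effective_time(te: int, td: int = 0, ts: int = 0) -> int:
    """Teff = Te + Td + Ts, all in clock cycles."""
    return te + td + ts


def peak_mips(clock_mhz: float, cycles_per_instruction: int = 2) -> float:
    """Peak throughput when every instruction takes the minimum number of cycles."""
    return clock_mhz / cycles_per_instruction


def sustained_mips(clock_mhz: float, teff_per_instruction: list[int]) -> float:
    """Average throughput over a sequence, given Teff for each instruction."""
    total_cycles = sum(teff_per_instruction)
    return clock_mhz * len(teff_per_instruction) / total_cycles


if __name__ == "__main__":
    print(peak_mips(30.0))
    # Hypothetical sequence: three 2-cycle instructions and one delayed by 3 cycles.
    sequence = [effective_time(2), effective_time(2), effective_time(2, td=3), effective_time(2)]
    print(round(sustained_mips(30.0, sequence), 2))
```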

D.2 BASIC EXECUTION TIMES 

Instructions flow in sequence through the pipeline stages
implemented by the Loader, Address Unit, and Execution Unit.
In almost all cases, the Loader is at least as fast at decod- 
ing an instruction as the Address Unit is at processing the 
instruction. Consequently, the effects of the Loader can be 
ignored when analyzing the smooth flow of instructions in 
the pipeline, and it is only necessary to consider the times 
for the Address Unit and Execution Unit. The time required 
by the Loader to fetch and decode instructions is significant 
only when there are control dependencies between instruc- 
tions or Instruction Cache misses, both of which are ex- 
plained later. 

The time for the pipeline to advance from one instruction to 
the next is typically determined by the maximum time of the 
Address Unit and Execution Unit to complete processing of 
the instruction on which they are operating. For example, if 
the Execution Unit is completing instruction n in 2 cycles 
and the Address Unit is completing instruction n+ 1 in 4 
cycles, then the pipeline will advance in 4 cycles. For certain 
instructions, such as RESTORE, the Address Unit waits until 
the Execution Unit has completed the instruction before 
proceeding to the next instruction. When such an instruction 
is in the Execution Unit, the time for the pipeline to advance 
is equal to the sum of the time for the Execution Unit to 
complete instruction n and the time for the Address Unit to 
complete instruction n+ 1. The processing times for the 
Loader, Address Unit, and Execution Unit are explained be- 
low. 
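
The two pipeline-advance rules described above (the maximum of the two unit times in the usual case, their sum when the Address Unit must wait for the Execution Unit) can be sketched as follows. The function and parameter names are illustrative; `address_unit_waits` stands for instructions such as RESTORE.

```python
def pipeline_advance_cycles(exec_cycles_n: int,
                            addr_cycles_n_plus_1: int,
                            address_unit_waits: bool = False) -> int:
    """Cycles for the pipeline to advance by one instruction.

    exec_cycles_n:        Execution Unit time for instruction n
    addr_cycles_n_plus_1: Address Unit time for instruction n+1
    address_unit_waits:   True when instruction n (e.g. RESTORE) forces the
                          Address Unit to wait until the Execution Unit is done
    """
    if address_unit_waits:
        return exec_cycles_n + addr_cycles_n_plus_1
    return max(exec_cycles_n, addr_cycles_n_plus_1)


if __name__ == "__main__":
    print(pipeline_advance_cycles(2, 4))                           # example from the text
    print(pipeline_advance_cycles(2, 4, address_unit_waits=True))  # serialized case
```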

D.2.1 Loader Timing 

The Loader can process an instruction field on each clock 
cycle, where a field is one of the following: 

• An opcode of 1 to 3 bytes including addressing mode 
specifiers. 

• Up to 2 index bytes, if scaled index addressing mode is 
used. 

• A displacement. 

• An immediate value of 8, 16 or 32 bits. 

The Loader requires additional time in the following cases: 

• 1 additional cycle when 2 consecutive double-word fields 
begin at an odd address. 

• 2 cycles in total to process a double-precision floating- 
point immediate value. 
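
The Loader rules above amount to counting instruction fields. The sketch below is a simplified model under stated assumptions: each field costs 1 cycle, a double-precision floating-point immediate costs 2 cycles in total, and the caller indicates whether the odd-address penalty for 2 consecutive double-word fields applies. The field labels and function name are hypothetical.

```python
# Hypothetical field labels: "opcode", "index", "disp", "imm", "imm_f64"
def loader_cycles(fields: list[str], odd_doubleword_pair: bool = False) -> int:
    """Estimate Loader cycles for one instruction from its list of fields."""
    cycles = 0
    for kind in fields:
        # A double-precision floating-point immediate takes 2 cycles in total.
        cycles += 2 if kind == "imm_f64" else 1
    if odd_doubleword_pair:
        # 2 consecutive double-word fields beginning at an odd address.
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(loader_cycles(["opcode", "disp"]))
    print(loader_cycles(["opcode", "imm_f64"]))
```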
D.2.2 Address Unit Timing

The processing time of the Address Unit depends on the
instruction’s operation and the number and type of its general
addressing modes. The basic time for most instructions
is 2 cycles. A relatively small number of instructions require
additional Address Unit time, as shown in the timing tables
in Section D.5.5. Floating-point instructions as well as
Custom-Slave instructions require an additional 3 cycles
plus 2 cycles for each quad-word operand in memory.

For instructions with 2 general addressing modes, 2 additional
cycles are required when both addressing modes refer
to memory. Certain general addressing modes require
additional processing time, as shown in Table D-1. For example,
the instruction MOVD 4(8(FP)),TOS requires 7 cycles
in the Address Unit: 2 cycles for the basic time, an
additional 2 cycles because both modes refer to memory,
and an additional 3 cycles for Memory Relative addressing
mode.

TABLE D-1. Additional Address Unit Processing
Time for Complex Addressing Modes

Mode               Additional Cycles
Memory Relative    3
External           8
Scaled Indexing    2

D.2.3 Execution Unit Timing

The Execution Unit processing times for the various
NS32GX32 instructions are provided in Section D.5.5. Certain
operations cause a break in the instruction flow through
the pipeline.

Some of these operations simply stop the Address Unit,
while others flush the instruction queue as well. The
information on how to evaluate the penalty resulting from
instruction flow breaks is provided in the following sections.

D.3 INSTRUCTION DEPENDENCIES

Interactions between instructions in the pipeline can cause
delays. Two types of interactions can arise, as described
below.

D.3.1 Data Dependencies

In certain circumstances the flow of instructions in the pipeline
will be delayed when the result of an instruction is used
as the source of a succeeding instruction. Such interlocks
are automatically detected by the microprocessor and handled
with complete transparency to software.

D.3.1.1 Register Interlocks

When an instruction uses a base register that is the destination
of either of the previous 2 instructions, a delay occurs.
Modifications of the Stack Pointer resulting from the use of
TOS addressing mode do not cause any delay. Also, there
is no delay for a data dependency when the instruction that
modifies the register is one for which the Address Unit
stops. The delay is 3 cycles when, as in the following example,
the base register is modified by the immediately preceding
instruction.

n:   ADDD R1,R0            ; modify R0
n+1: MOVD 4(R0),R2         ; R0 is base register,
                           ; delay 3 cycles

The delay is 1 cycle when the register is modified 2 instructions
before its use as a base register, as shown in this
example.

n:   ADDD R1,R0            ; modify R0
n+1: MOVD 4(SP),R3         ; R0 not used
n+2: MOVD 4(R0),R2         ; R0 is base register,
                           ; delay 1 cycle

When an instruction uses an index register that is the destination
of the previous instruction, a delay of 1 cycle occurs,
as shown in the example below. If the register is modified 2
or more instructions prior to its use as an index register,
then no delay occurs.

n:   ADDD R1,R0            ; modify R0
n+1: MOVD 4(SP)[R0:B],R2   ; R0 is index register,
                           ; delay 1 cycle

Bypass circuitry in the Execution Unit generally avoids delay
when a register modified by one instruction is used as the
source operand of the following instruction, as in the following
example.

n:   ADDD R1,R0            ; modify R0
n+1: MOVD R0,R2            ; R0 is source register,
                           ; no delay

For the uncommon case where the operand in the source
register is larger than the destination of the previous instruction,
a delay of 2 cycles occurs. Here is an example.

n:   ADDB R1,R0            ; modify byte in R0
n+1: MOVD R0,R2            ; R0 double-word source operand,
                           ; 2 cycle delay

Note: The Address Unit does not make any differentiation between CPU
and FPU registers. Therefore, register interlocks can occur between
integer and floating-point instructions.

D.3.1.2 Memory Interlocks

When an instruction reads a source operand (or address for
effective address calculation) from memory that depends on
the destination of either of the previous 2 instructions, a
delay occurs. The CPU detects a dependency between a
read and a write reference in the following cases, which
include some false dependencies in addition to all actual
dependencies:

• Either reference crosses a double-word boundary

• Address bits 0 through 11 are equal

• Address bits 2 through 11 are equal and either reference
is for a word

• Address bits 2 through 11 are equal and either reference
is for a double-word

The delay for a memory interlock is 4 cycles when, as in
the following example, the memory location is modified by
the immediately preceding instruction.

n:   ADDQD 1,4(SP)         ; modify 4(SP)
n+1: CMPD 10,4(SP)         ; read 4(SP),
                           ; 4 cycle delay




The delay is 2 cycles when the memory location is modified 
2 instructions before its use as a source operand or effec- 
tive address, as shown in this example. 

n:   ADDQD 1,4(SP)         ; modify 4(SP)
n+1: MOVD R0,R1            ; no reference to 4(SP)
n+2: CMPD 10,4(SP)         ; read 4(SP),
                           ; 2 cycle delay

Certain sequences of read and write references can cause 
a delay of 1 cycle although there is no data dependency 
between the references. This arises because the Data 
Cache is occupied for 2 cycles on write references. In the 
absence of data dependencies, read references are given 
priority over write references. Therefore, this delay only oc- 
curs when an instruction with destination in memory is fol- 
lowed 2 instructions later by an instruction that refers to 
memory (read or write) and 3 instructions later by an instruc- 
tion that reads from memory. Here is an example: 

n:   MOVD R0,4(SP)         ; memory write
n+1: MOVD R6,R7            ; any instruction
n+2: MOVD 8(SP),R0         ; memory read or write
n+3: MOVD 12(SP),R1        ; memory read,
                           ; delayed 1 cycle

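
The register and memory interlock delays described in Section D.3.1 can be collected into a small lookup sketch. The function names and the `distance` parameter are hypothetical. The exceptions mentioned in the text (Stack Pointer changes through TOS addressing, instructions for which the Address Unit stops, and the 1 cycle Data Cache contention case) are deliberately not modeled.

```python
def register_interlock_delay(use, distance, wider_than_writer=False):
    """Delay in cycles when a register written `distance` instructions
    earlier is used by the current instruction.

    use: "base", "index", or "source".
    distance: 1 if written by the immediately preceding instruction,
        2 if written two instructions before, and so on.
    wider_than_writer: for "source" only, True when the operand read is
        larger than the destination of the previous instruction.
    """
    if use == "base":
        return {1: 3, 2: 1}.get(distance, 0)
    if use == "index":
        return 1 if distance == 1 else 0
    if use == "source":
        # Bypass circuitry normally hides the dependency.
        return 2 if (distance == 1 and wider_than_writer) else 0
    raise ValueError("unknown register use: %r" % (use,))


def memory_interlock_delay(distance):
    """Delay in cycles when a memory location written `distance`
    instructions earlier is read by the current instruction."""
    return {1: 4, 2: 2}.get(distance, 0)
```
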
D.3.2 Control Dependencies 

The flow of instructions through the pipeline is delayed
when the address from which to fetch an instruction depends
on a previous instruction, such as when a conditional
branch is executed. The Loader includes special circuitry to
handle branch instructions (ACB, BR, Bcond, and BSR) that 
serves to reduce such delays. When a branch instruction is 
decoded, the Loader calculates the destination address and 
selects between the sequential and non-sequential instruc- 
tion streams. The non-sequential stream is selected for un- 
conditional branches. For conditional branches the selec- 
tion is based on the branch’s direction (forward or back- 
ward) as well as the tested condition. The branch is predict- 
ed taken in any of the following cases. 

• The branch is backward. 

• The tested condition is either NE or LE. 

Measurements have shown that the correct stream is se- 
lected for 64% of conditional branches and 71% of total 
branches. 
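
The static prediction rule stated above is simple enough to express directly. The sketch below is illustrative only: the function name and parameters are hypothetical, and condition codes are passed as the two-letter strings used in the text.

```python
def predicted_taken(conditional, backward=False, condition=None):
    """Return True when the Loader selects the non-sequential stream.

    conditional: False for unconditional branches, which always select
        the non-sequential stream.
    backward: True when the branch destination precedes the branch.
    condition: tested condition of a conditional branch, e.g. "NE".
    """
    if not conditional:
        return True
    return backward or condition in ("NE", "LE")
```

For example, a forward branch on EQ is predicted not taken, while any backward conditional branch (typically a loop) is predicted taken.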

If the Loader selects the non-sequential stream, then the 
destination address is transferred to the Instruction Cache. 
For conditional branches, the Loader saves the address of 
the alternate stream (the one not selected). When a condi- 
tional branch instruction reaches the Execution Unit, the 
condition is resolved, and the Execution Unit signals the 
Loader whether or not the branch was taken. If the branch 
had been incorrectly predicted, the Instruction Cache be- 
gins fetching instructions from the correct stream. 

The delay for handling a branch instruction depends on 
whether the branch is taken and whether it is predicted cor- 
rectly. Unconditional branches have the same delay as cor- 
rectly predicted, taken conditional branches. 

Another form of delay occurs when 2 consecutive condition- 
al branch instructions are executed. This delay of 2 cycles 
arises from contention for the register that holds the alter- 
nate stream address in the Loader. 

Control dependencies also arise when JUMP, RET, and oth- 
er non-branch instructions alter the sequential execution of 
instructions. 


D.4 STORAGE DELAYS 

The flow of instructions in the pipeline can be delayed by 
off-chip memory references that result from misses in the 
on-chip storage buffers and by misalignment of instructions 
and operands. These considerations are explained in the 
following sections. The delays reported assume no wait 
states on the external bus and no interference between in- 
struction and data references. 

D.4.1 Instruction Cache Misses 

An Instruction Cache miss causes a 5 cycle gap in the fetching
of instructions. When the miss occurs for a non-sequential
instruction fetch, the pipeline is idle for the entire gap, so
the delay is 5 cycles. When the miss occurs for a sequential
fetch, the pipeline is not idle for the entire gap because
instructions that have been prefetched ahead and buffered
can be executed. The delay for misses on sequential
instruction fetches can be estimated to be approximately
half the gap, or 2.5 cycles.

D.4.2 Data Cache Misses

A Data Cache miss causes a delay of 2 cycles. When a 
burst read cycle is used to fill the cache block, then 3 addi- 
tional cycles are required to update the Data Cache. In case 
a burst cycle is used and either of the 2 instructions follow- 
ing the instruction that caused the miss also reads from 
memory, then an additional delay occurs: 3 cycle delay 
when the instruction that reads from memory immediately 
follows the miss, and 2 cycle delay when the memory read 
occurs 2 instructions after the miss. 

D.4.3 Instruction and Operand Alignment 

When a data reference (either read or write) crosses a dou- 
ble-word boundary, there is a delay of 2 cycles. 

When the opcode for a non-sequential instruction crosses a 
double-word boundary, there is a delay of 1 cycle. No delay 
occurs in the same situation for a sequential instruction. 
There is also a delay of 2 cycles when an instruction fetch is 
located on a different page from the previous fetch and 
there is a hit in the Instruction Cache. This delay, which is 
due to the time required to translate the new page’s ad- 
dress, also occurs following any serializing operation. 
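
The storage delays of Section D.4 can be sketched as a few helper functions. All names are hypothetical. One assumption is made loudly: the 3 cycles needed to update the Data Cache after a burst fill are treated as cache occupancy, and they are charged to the pipeline only through the follow-up read penalties the text lists.

```python
def icache_miss_delay(sequential):
    """Estimated delay for an Instruction Cache miss: the full 5 cycle
    gap for a non-sequential fetch, about half of it otherwise."""
    return 2.5 if sequential else 5.0


def dcache_miss_delay(burst_fill=False, next_read_distance=None):
    """Estimated delay for a Data Cache miss.

    burst_fill: True when a burst read cycle fills the cache block.
    next_read_distance: 1 or 2 when the first or second instruction
        after the miss also reads memory; None when neither does.
    """
    delay = 2
    if burst_fill:
        # Assumption: only a following memory read sees the update time.
        delay += {1: 3, 2: 2}.get(next_read_distance, 0)
    return delay


def data_alignment_delay(address, size_bytes):
    """2 cycles when a data reference crosses a double-word boundary."""
    crosses = (address % 4) + size_bytes > 4
    return 2 if crosses else 0
```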

D.5 EXECUTION TIME CALCULATIONS 

This section provides the necessary information to calculate
the Te portion of the effective time required by the CPU to
execute an instruction.

The effects of data dependencies and storage delays are
not taken into account in the evaluation of Te; rather, they
should be separately evaluated through a careful examination
of the instruction sequence.

The following assumptions are made: 

— The entire instruction, with displacements and immedi- 
ate operands, is present in the instruction queue when 
needed. 

— All memory operands are available to the Execution Unit 
and Address Unit when needed. 

— Memory writes are performed at full speed through the 
write buffer. 

— Where possible, the values of operands are taken into 
consideration when they affect instruction timing, and a 
range of times is given. When this is not done, the worst 
case is assumed. 


D.5.1 Definitions 

Teu   Time required by the Execution Unit to execute an
      instruction.

Tau   Total processing time in the Address Unit.

Tad   Extra time needed by the Address Unit, in addition
      to the basic time, to process more complex cases.
      Tad can be evaluated as follows:

      Tad = Tx + Ty1 + Ty2

      Tx = 2 if the instruction has two general operands
             and both of them are in memory.
           0 otherwise.

      Ty1 and Ty2 are related to operands 1 and 2
      respectively. Their values are given below.

      Ty(1,2) = 3 if Memory Relative
                8 if External
                2 if Scaled Indexing
                0 if any other addressing mode

The following parameters are only used for floating-point
execution time calculations.

Tanp  Additional Address Unit time needed to process
      floating-point instructions (Section D.2.2). Tanp can
      be calculated as follows:

      Tanp = 3 + 2 * (Number of 64-bit operands in memory)

Ttcs  Time required to transfer ID and Opcode, if no operand
      needs to be transferred to the slave. Otherwise,
      it is the time needed to transfer the last 32
      bits of operand data to the slave. In the latter case
      the transfer of ID and Opcode as well as any operand
      data except the last 32 bits is included in the
      Execution Unit timing.

Ttsc  Time required by the CPU to complete the floating-point
      instruction upon receiving the DONE signal
      from the slave. This includes the time to process
      the DONE signal itself in addition to the time needed
      to read the result (if any) from the slave.

l     This parameter is related to the floating-point operand
      size as follows:

      Standard floating (32 bits): l = 0
      Long floating (64 bits):     l = 1
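
The Tad and Tanp definitions above can be sketched as follows. The function names and the addressing-mode strings are hypothetical labels chosen for illustration; the cycle values are the ones given in the definitions.

```python
# Extra Address Unit cycles per addressing mode (the Ty values above).
TY_CYCLES = {"memory_relative": 3, "external": 8, "scaled_indexing": 2}


def t_ad(modes, operands_in_memory):
    """Tad = Tx + Ty1 + Ty2.

    modes: addressing-mode names of the general operands (at most 2).
    operands_in_memory: how many of those operands are in memory.
    """
    tx = 2 if (len(modes) == 2 and operands_in_memory == 2) else 0
    return tx + sum(TY_CYCLES.get(mode, 0) for mode in modes)


def t_anp(quad_operands_in_memory):
    """Tanp = 3 + 2 * (number of 64-bit operands in memory)."""
    return 3 + 2 * quad_operands_in_memory
```

As a check against the worked example in Section D.2.2, MOVD 4(8(FP)),TOS has a Memory Relative source and a Top of Stack destination, both in memory, so `2 + t_ad(["memory_relative", "top_of_stack"], 2)` evaluates to the 7 cycles quoted there.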

D.5.2 Notes on Table Use 

1. In the Teu column the notation n1 → n2 means n1 minimum,
   n2 maximum.

2. In the notes column, notations held within angle brackets
   <> indicate alternatives in the operand addressing
   modes which affect the execution time. A table entry
   which is affected by the operand addressing may have
   multiple values, corresponding to the alternatives. These
   addressing notations are:

   <I>   Immediate
   <R>   CPU register
   <M>   Memory
   <F>   FPU register, either 32 or 64 bits
   <m>   Memory, except Top of Stack
   <T>   Top of Stack
   <x>   Any addressing mode
   <ab>  a and b represent the addressing modes of operands
         1 and 2 respectively. Both of them can be
         any addressing mode (e.g., <MR> means
         memory to CPU register).

3. The notation ‘Break K’ provides pipeline status information
   after executing the instruction to which ‘Break K’ applies.
   The value of K is interpreted as follows:

   K = 0  The Address Unit was stopped by the instruction
          but the pipeline was not flushed. The Address
          Unit can start processing the next instruction
          immediately.

   K > 0  The pipeline was flushed by the instruction. The
          Address Unit must wait for K cycles before it can
          start processing the next instruction.

   K < 0  The Address Unit was stopped at the beginning
          of the instruction but it was restarted |K| cycles
          before the end of it. The Address Unit can start
          processing the next instruction |K| cycles before
          the end of the instruction to which ‘Break K’ applies.

4. Some instructions must wait for pending writes to complete
   before being able to execute. The number of cycles
   that these instructions must wait is between 6 and 7
   for the first operand in the write buffer and 2 for the second
   operand, if any.

5. The CBITIi and SBITIi instructions will execute a RMW
   access after waiting for pending writes. The extra time
   required for the RMW access is only 3 cycles since the
   read portion is overlapped with the time in the Execution
   Unit.

6. The keywords defined for the Bcond instruction have the
   following meanings:

   BTPC   Branch Taken, Predicted Correctly
   BTPI   Branch Taken, Predicted Incorrectly
   BNTPC  Branch Not Taken, Predicted Correctly
   BNTPI  Branch Not Taken, Predicted Incorrectly

D.5.3 Teff Evaluation

The Te portion of the effective execution time for a certain
instruction in an instruction sequence is obtained by performing
the following steps:

1. Label the current and previous instruction in the sequence
   with n and n-1 respectively.

2. Obtain from the tables the values of Teu and Tau for
   instruction n and Teu for instruction n-1.

3. For floating-point instructions, obtain the values of Ttcs
   and Ttsc.

4. Use the following formula to determine the execution time Te:

   Te = func(Tau(n), Teu(n-1), Tflt(n-1), Break(n-1)) + Teu(n) + Tflt(n)

   func provides the amount of processing time in the Address
   Unit that cannot be hidden. Its definition is given below.

   func = 0                   if Tau(n) ≤ (Teu(n-1) + Tflt(n-1))
                              AND NOT Break(n-1)

          Tau(n) - Teu(n-1)   if Tau(n) > (Teu(n-1) + Tflt(n-1))
                              AND NOT Break(n-1)

          Tau(n) + K          if (Tau(n) + K) > 0
                              AND Break(n-1)

          0                   if (Tau(n) + K) ≤ 0
                              AND Break(n-1)

   K is the value associated with Break(n-1).

   Tflt only applies to floating-point instructions and is
   always 0 for other instructions. It is evaluated as follows:

   Tflt = Ttcs + Ttsc + Tfpu

   Tfpu is the execution time in the Floating-Point Unit.

5. Calculate the total execution time Teff by using the following
   formula:

   Teff = Te + Td + Ts

   where Td and Ts are dependent on the instruction sequence,
   and can be obtained using the information provided
   in Section D.4.

D.5.4 Instruction Timing Example

This section presents a simple instruction timing example
for a procedure that recursively evaluates the Fibonacci
function. In this example there are no data dependencies or
storage buffer misses; only the basic instruction execution
times in the pipeline, control dependencies, and instruction
alignment are considered.

The following is the source of the procedure in C.

unsigned fib(x)
int x;
{
    if (x > 2)
        return (fib(x-1) + fib(x-2));
    else
        return (1);
}

The assembly code for the procedure with comments indicating
the execution time is shown below. The procedure
requires 26 cycles to execute when the actual parameter is
less than or equal to 2 (branch taken) and 99 cycles when
the actual parameter is equal to 3 (recursive calls).

_fib:  movd   r3,tos      ; 2 cycles
       movd   r4,tos      ; 2 cycles
       movd   r1,r3       ; 2 cycles
       cmpqd  $(2),r3     ; 2 cycles
       bge    .L1         ; 2 cycles, Break 2 if Branch Taken
       movd   r3,r1       ; 2 cycles
       addqd  $(-2),r1    ; 2 cycles
       bsr    _fib        ; 3 cycles
       movd   r0,r4       ; 2 cycles + 4 cycles due to RET
       movd   r3,r1       ; 2 cycles
       addqd  $(-1),r1    ; 2 cycles
       bsr    _fib        ; 3 cycles
       addd   r4,r0       ; 2 cycles + 1 cycle alignment + 4 cycles due to RET
       movd   tos,r4      ; 2 cycles
       movd   tos,r3      ; 2 cycles
       ret    $(0)        ; 4 cycles, Break 4
       .align 4
.L1:   movqd  $(1),r0     ; 4 cycles + 4 cycles due to BGE
       movd   tos,r4      ; 2 cycles
       movd   tos,r3      ; 2 cycles
       ret    $(0)        ; 4 cycles, Break 4
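
The func term of Section D.5.3 and the branch-taken total of the listing above can be sketched together. This is an illustrative model, not a cycle-accurate simulator. The names `func`, `te`, `break_k` and `TAKEN_PATH_CYCLES` are hypothetical, and `break_k=None` is used to mean that instruction n-1 causes no break (which differs from Break 0).

```python
def func(tau_n, teu_prev, tflt_prev=0, break_k=None):
    """Address Unit time for instruction n that cannot be hidden.

    break_k: the K value of 'Break K' for instruction n-1, or None when
        instruction n-1 causes no break.
    """
    if break_k is None:
        if tau_n <= teu_prev + tflt_prev:
            return 0
        return tau_n - teu_prev
    # After a break: Tau(n) + K when positive, otherwise 0.
    return max(tau_n + break_k, 0)


def te(tau_n, teu_n, teu_prev, tflt_n=0, tflt_prev=0, break_k=None):
    """Te = func(...) + Teu(n) + Tflt(n)."""
    return func(tau_n, teu_prev, tflt_prev, break_k) + teu_n + tflt_n


# Cycle annotations along the branch-taken path of the listing:
# four prologue instructions, bge, then the four instructions at .L1.
TAKEN_PATH_CYCLES = [2, 2, 2, 2, 2, 4 + 4, 2, 2, 4]
```

Summing `TAKEN_PATH_CYCLES` gives 26, matching the figure quoted for a parameter less than or equal to 2.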

D.5.5 Execution Timing Tables 

The following tables provide the execution timing information for all the NS32GX32 instructions. The table for the floating-point 
instructions provides only the CPU portion of the total execution time. The FPU execution times can be found in the NS32381 
datasheet. 


D.5.5.1 Basic Instructions


Mnemonic 



BICi 


BICPSRi 



2 + T at j Wait for pending writes. 
Break 5 


2 + T a( j Wait tor pending writes. 
Break 5 


Modular 

Direct 



3 + T a d 


2 + T a d Breaks 


<R> 

<M> Break 0 


2 + T ad <M> 

Wait for pending writes. 
Execute interlocked 
RMW access. Break 5 


2 + T a d Break -3. 

If SRC is out of bounds 
and the V bit in the PSR 
is set, then add trap 
time. 


Mnemonic 


CINV 





6 + 8 » n 


2 


7 + 13 * n 





2 


5 


17 


21 


28 + 4 * i 











— 

Notes 

2 + T ad 

Wait for 
pending 
writes. 

Break 5 

2 + T a d 

n = number 
of elements. 
Break 0 

2 + Tad 


2 + T ad 

n = number 
of elements. 
Break 0 

2 + Tad 

n = number 
of elements. 
Break 0 

2 + Tgd 


4 + Tgd 


13 

Break 5 

11 +T ad 

Break 5 

5 + T a d 

i = 0/4/12 
forB/W/D. 
Break 0 

2 

Break 5 

2 + T a d 

i = 0/4/12 

for B/W/D 

3 

n = number 
of registers 
saved. 

Break 0 

2 

n = number 
of registers 
restored 

8 

<R> 

8 + Tad 

<M> 


Break -3 

6 

<R> 

6 + Tgd 

<M> 


Break -3 

2 + Tgd 

i = number 
of bytes 


Notes 


No trap 
Trap, Modular 
Trap, Direct 
If trap then: 
(wait for 
pending writes; 
Break 5 j 


If <M> 
then Break 0 


Mnemonic 

T eu 

■SH 

FLAG 

4 

2 


32 

2 


21 

2 

IBITi 

10 

2 


14 

2 + Tad 

INDEXi 

43 

5 + T a d 

INSi 

15 

8 


18 

8 + Tad 

INSSi 

14 

6 


19 

6 + T a d 

JSR 

3 

9 + T a d 

JUMP 

3 

4 + Tgd 

LPRi 

6 

2 + Tgd 


5 

2 + Tgd 


7 

2 + Tad 

LSHi 

3 

2 + T ad 

MEIi 

13 + 2*i 

5 + Tad 

MODi 

(34 -* 49) 

2 + Tgd 


+ 4*i 


MOVi 

2 

2 + Tad 

MOVMi 

5 + 4 * n 

2 + Tgd 

MOVQi 

2 

2 + T a d 

MOVSi 




12 + 4*n 

2 + Tad 


14 + 8*n 

2 + Tad 

MOVST 

16 + 9*n 

2 + Tad 



SP, USP, SP, MOD. 
Break 0 

CPU Reg = CFG, 
INTBASE, DSR, 
BPC, UPSR. 

Wait for pending 
writes. 

Break 5 

CPU Reg = DCR, 
PSR CAR. Wait for 
pending writes. 
Break 5 


for B/W/D. 
Break 0 



of elements. 
Break 0 


n = number 
of elements. 


Options in effect. 
Break 0 


of elements. 
Break 0 


Mnemonic 


MOVSVi 

9 

MOVUSi 

11 

MOVXii 

2 

MOVZii 

2 

MULi 

13 + 2*1 


24 

NEGi 

2 

NOP 

2 

NOTi 

3 

ORi 

2 

QUOi 

(30 -+ 40) 


+ 4*i 

REMi 

(32 -»■ 42) 
+ 4*i 

RESTORE 

7 + 2 * n 

RET 

4 

RETI 

19 


13 


29 


22 

RETT 

14 


8 

ROTi 

7 

RXP 

8 

SCONDi 

3 

SAVE 

8 + 2 * n 

SBITi 

10 


14 


2 + T ac | 


2 + T a d 


2 + Tad 


2 + Tad 


2 + Tad 
2 + T a d 


2 + Tad 


2 


2 + Tad 


2 + Tgd 


2 + T a d 


2 + Tgd 


Wait for 
pending writes. 
Break 5 


Wait for 
pending writes. 
Break 5 


i = 0/4/12 
for B/W/D. 
General case. 

If MULD and 
0 £ SRC £ 255 





i = 0/4/12 

for B/W/D 

i = 0/4/12 

for B/W/D 

n = number 
of registers 
restored. 

Break 0 


Break 4 


Noncascaded, Modular 
Noncascaded, Direct 
Cascaded, Modular 
Cascaded, Direct 

Wait for 
pending writes. 

Break 5 


5 Modular 

5 Direct 


Wait for 
pending writes. 
Break 5 


2 + Tgd 


5 


2 + Tad 


2 



n = number 
of registers. 
Break 0 


2 <R> 

2 + T ad <M> 
Break 0 

D.5.5.2 Floating-Point Instructions, CPU Portion 

Mnemonic            Teu         Tau               Ttcs  Ttsc      Notes

MOVf, NEGf,         2           2 + Tanp          2     1         <FF>
ABSf, LOGBf         4 + 3*l     2 + Tanp + Tad    2     1         <MF>
                    6 + 3*l     2 + Tanp          2     1         <IF>
                    6 + 3*l     2 + Tanp          2     1         <TF>
                    11 + 4*l    2 + Tanp + Tad    2     3 + 2*l   <FM> Break -(1 + l)
                    13 + 7*l    2 + Tanp + Tad    2     3 + 2*l   <MM>, <IM> Break -(1 + l)

ADDf, SUBf,         2           2 + Tanp          2     1         <FF>
MULf, DIVf,         4 + 3*l     2 + Tanp          2     1         <MF>
SCALBf              6 + 3*l     2 + Tanp          2     1         <IF>
                    6 + 3*l     2 + Tanp          2     1         <TF>
                    17 + 7*l    2 + Tanp + Tad    2     3 + 2*l   <FM> Break -(1 + l)
                    19 + 10*l   2 + Tanp + Tad    2     3 + 2*l   <MM>, <IM> Break -(1 + l)

ROUNDfi, TRUNCfi,   11          2 + Tanp          2     3 + 2*l   <FR> Break -1
FLOORfi             11 + 4*l    2 + Tanp + Tad    2     3 + 2*l   <FM> Break -(1 + l)
                    13          2 + Tanp + Tad    2     3 + 2*l   <MR>, <IR> Break -1
                    13 + 7*l    2 + Tanp + Tad    2     3 + 2*l   <MM>, <IM> Break -(1 + l)

CMPf                18          2 + Tanp          2               <FF>
                    20 + 3*l    2 + Tanp + Tad    2               <MF>
                    23 + 3*l    2 + Tanp + Tad    2               <FM>
                    25 + 6*l    2 + Tanp + Tad    2               <MM>, <IM>, <MI>, <II>
                                                                  Break 3

POLYf, DOTf         2           2 + Tanp          2     1         <FF>
                    4 + 3*l     2 + Tanp + Tad    2     1         <MF>
                    6 + 3*l     2 + Tanp          2     1         <IF>, <TF>
                    11 + 4*l    2 + Tanp + Tad    2     1         <FM> Break -(1 + l)
                    13 + 7*l    2 + Tanp + Tad    2     1         <MM>, <MI>, <IM>, <II>
                                                                  Break -(1 + l)

MOVif               6           2 + Tanp          2     1         <RF>
                    13          2 + Tanp + Tad    2               <RM> Break -1
                    6 + 3*l     2 + Tanp + Tad    2     1         <MF>, <IF>, <TF>
                    13 + 7*l    2 + Tanp + Tad    2               <MM>, <IM>
                                                                  Break -(1 + l)

LFSR                6           2 + Tanp          2     1         <R>
                    6 + 3*l     2 + Tanp + Tad    2     1         <M>
                    6 + 3*l     2 + Tanp          2     1         <I>
                    6 + 3*l     2 + Tanp          2     1         <T>

SFSR                11          2 + Tanp + Tad    2     3         Break -1

MOVFL               4           2 + Tanp          2     1         <FF>
                    6           2 + Tanp + Tad    2     1         <MF>, <IF>, <TF>
                    15          2 + Tanp + Tad    2               <FM> Break 0
                    17          2 + Tanp + Tad    2               <MM>, <IM> Break 0

MOVLF               4           2 + Tanp          2     1         <FF>
                    9           2 + Tanp + Tad    2     1         <MF>, <IF>, <TF>
                    15          2 + Tanp + Tad    2               <FM> Break 0
                    20          2 + Tanp + Tad    2               <MM>, <IM> Break 0
National Semiconductor

NS32CG16-10/NS32CG16-15
High-Performance Printer/Display Processor


PRELIMINARY 


General Description 

The NS32CG16 is a 32-bit microprocessor in the Series 
32000® family that provides special features for graphics 
applications. It is specifically designed to support page ori- 
ented printing technologies such as Laser, LCS, LED, Ion- 
Deposition and InkJet. 

The NS32CG16 provides a 16 Mbyte linear address space 
and a 16-bit external data bus. It also has a 32-bit ALU, an 
eight-byte prefetch queue, and a slave processor interface. 
The capabilities of the NS32CG16 can be expanded by us- 
ing an external floating point unit which interfaces to the 
NS32CG16 as a slave processor. This combination pro- 
vides optimal support for outline character fonts. 

The NS32CG16’s highly efficient architecture, in addition to
the built-in capabilities for supporting BITBLT (BIT-aligned
BLock Transfer) operations and other special graphics functions,
makes the device the ideal choice to handle a variety
of page description languages such as PostScript™ and
PCL™.


Features 

■ Software compatible with the Series 32000 family 

■ 32-bit architecture and implementation 

■ 16 Mbyte linear address space 

■ Special support for imaging applications such as print- 
ers, faxes and scanners 

— 18 graphics instructions 

— Binary compression/expansion capability for font 
storage using RLL encoding 

— Pattern magnification for Epson and HP LaserJet™ 
emulations 

— 6 BITBLT instructions on chip 

— Interface to an external BITBLT processing unit for 
very fast BITBLT operations (optional) 

■ Floating point support via the NS32081 or the NS32381 
for outline fonts, scaling and rotation 

■ On-chip clock generator 

■ Optimal interface to large memory arrays via the 
DP84xx family of DRAM controllers 

■ Power save mode 

■ High-speed CMOS technology 

■ 68-pin plastic PCC package 


Block Diagram

[Figure: NS32CG16 block diagram; external signal groups labeled ADD/DATA and CONTROLS & STATUS]

1.0 Product Introduction 

The NS32CG16 is a high speed CMOS microprocessor in 
the Series 32000 family. It is software compatible with all 
the other CPUs in the family. The device incorporates all of 
the Series 32000 advanced architectural features, with the 
exception of the virtual memory capability. 

Brief descriptions of the NS32CG16 features that are 
shared with other members of the family are provided be- 
low: 

Powerful Addressing Modes. Nine addressing modes 
available to all instructions are included to access data 
structures efficiently. 

Data Types. The architecture provides for numerous data 
types, such as byte, word, doubleword, and BCD, which may 
be arranged into a wide variety of data structures.

Symmetric Instruction Set. While avoiding special case
instructions that compilers can't use, the Series 32000 fami- 
ly incorporates powerful instructions for control operations, 
such as array indexing and external procedure calls, which 
save considerable space and time for compiled code.

Memory-to-Memory Operations. The Series 32000 CPUs
represent two-address machines. This means that each op- 
erand can be referenced by any one of the addressing 
modes provided. 

This powerful memory-to-memory architecture permits 
memory locations to be treated as registers for all useful 
operations. This is important for temporary operands as well 
as for context switching. 


Large, Uniform Addressing. The NS32CG16 has 24-bit 
address pointers that can address up to 16 megabytes with- 
out any segmentation; this addressing scheme provides 
flexible memory management without added-on expense.

Modular Software Support. Any software package for the
Series 32000 family can be developed independent of all 
other packages, without regard to individual addressing. In 
addition, ROM code is totally relocatable and easy to ac- 
cess, which allows a significant reduction in hardware and 
software cost. 

Software Processor Concept. The Series 32000 architec- 
ture allows future expansions of the instruction set that can 
be executed by special slave processors, acting as exten- 
sions to the CPU. This concept of slave processors is 
unique to the Series 32000 family. It allows software com- 
patibility even for future components because the slave 
hardware is transparent to the software. With future ad- 
vances in semiconductor technology, the slaves can be 
physically integrated on the CPU chip itself. 

To summarize, the architectural features cited above pro- 
vide three primary performance advantages and character- 
istics: 

• High-Level Language Support 

• Easy Future Growth Path 

• Application Flexibility 

1.1 NS32CG16 SPECIAL FEATURES 

In addition to the above Series 32000 features, the 
NS32CG16 provides features that make the device ex- 
tremely attractive for a wide range of applications where 
graphics support, low chip count, and low power consump- 
tion are required. 

The most relevant of these features are the graphics sup- 
port capabilities, that can be used in applications such as 
printers, CRT terminals, and other varieties of display sys- 
tems, where text and graphics are to be handled. 

Graphics support is provided by eighteen instructions that 
allow operations such as BITBLT, data compression/expan- 
sion, fills, and line drawing, to be performed very efficiently. 
In addition, the device can be easily interfaced to an exter- 
nal BITBLT Processing Unit (BPU) for high BITBLT perform- 
ance. 

The NS32CG16 allows systems to be built with a relatively 
small amount of random logic. The bus is highly optimized 
to allow simple interfacing to a large variety of DRAMs and 
peripheral devices. All the relevant bus access signals and 
clock signals are generated on-chip. The cycle extension 
logic is also incorporated on-chip. 

The device is fabricated in a low-power, double-poly, single 
metal, CMOS technology. It also includes a power-save fea- 
ture that allows the clock to be slowed down under software 
control, thus minimizing the power consumption. This fea- 
ture can be used in those applications where power saving 
during periods of low performance demand is highly desir- 
able. 

The bus characteristics and the power save feature are de- 
scribed in the “Functional Description” section. A general 
overview of BITBLT operations and a description of the 
graphics support instructions is provided in Section 2.4. De- 
tails on all the NS32CG16 instructions can be found in the 
NS32CG16 Printer/Display Processor Programmer's Refer- 
ence Supplement and the related NS32CG16 supplement. 
Below is a summary of the instructions that are directly ap- 
plicable to graphics along with their intended use. 

Instruction   Application

BBAND         The BitBlt group of instructions provides a
BBOR          method of quickly imaging characters, creating
BBFOR         patterns, windowing and other block-oriented
BBXOR         effects.
BBSTOD
BITWT
EXTBLT

MOVMP         Move Multiple Pattern is a very fast instruction
              for clearing memory and drawing patterns and
              lines.

TBITS         Test Bit String measures the length of a run of 1's
              or 0's in an image, supporting many data
              compression methods (RLL). TBITS may also
              be used to test for boundaries of images.

SBITS         Set Bit String is a very fast instruction for filling
              objects, outlining characters and drawing
              horizontal lines.

              The TBITS and SBITS instructions support
              Group 3 and Group 4 CCITT communications
              (FAX).

SBITPS        Set Bit Perpendicular String is a very fast
              instruction for drawing vertical, horizontal and
              45° lines.

              In printing applications SBITS and SBITPS may
              be used to express portrait and landscape
              respectively from the same compressed font
              data. The size of the character may be scaled as
              it is drawn.

SBIT          The Bit group of instructions enables single pixels
CBIT          anywhere in memory to be set, cleared, tested
TBIT          or inverted.
IBIT

INDEX         The INDEX instruction combines a multiply-add
              sequence into a single instruction. This provides
              a fast translation of an X-Y address to a pixel
              relative address.
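
The run-length idea behind the TBITS entry above can be pictured with a short Python sketch. This is a conceptual model only, not the register-level behavior of the instruction; the function name `run_lengths` and the list-of-bits input are assumptions made for the illustration.

```python
def run_lengths(bits):
    """Return (bit value, run length) pairs for a sequence of 0/1 pixels.

    Hypothetical helper: it models the kind of measurement that
    run-length (RLL) compression needs, one scan line at a time.
    """
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([bit, 1])     # start a new run
    return [(value, length) for value, length in runs]


scan_line = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
encoded = run_lengths(scan_line)
```

A run-length coder would emit only the lengths, since the bit value alternates from one run to the next.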


2.0 Architectural Description 

2.1 REGISTER SET 

The NS32CG16 CPU has 17 internal registers grouped ac- 
cording to functions as follows: 8 general purpose, 7 ad- 
dress, 1 processor status and 1 configuration. Figure 2-1 
shows the NS32CG16 internal registers. 

[Figure 2-1 shows four register groups: General Purpose (R0-R7); Address (32 bits: PC, SP0, SP1, FP, SB, INTBASE, MOD); Processor Status (PSR); Configuration (CFG).]

FIGURE 2-1. NS32CG16 Internal Registers

2.1.1 General Purpose Registers 

There are eight registers (R0-R7) used for satisfying the 
high speed general storage requirements, such as holding 
temporary variables and addresses. The general purpose 
registers are free for any use by the programmer. They are 
32 bits in length. If a general purpose register is specified for
an operand that is 8 or 16 bits long, only the low part of the
register is used; the high part is not referenced or modified.
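
The partial-write rule above can be sketched in Python as follows. The function name `write_register` and the integer model of a register are assumptions made for illustration; the sketch only restates the rule that an 8- or 16-bit operand leaves the high part of the register untouched.

```python
def write_register(old_value, operand, size_bytes):
    """Model writing a 1-, 2- or 4-byte operand into a 32-bit register.

    Only the low `size_bytes` bytes change; the remaining high bytes
    keep their previous contents.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("operand size must be 1, 2 or 4 bytes")
    mask = (1 << (8 * size_bytes)) - 1
    return (old_value & ~mask & 0xFFFFFFFF) | (operand & mask)


r0 = 0x12345678
r0_after_byte_write = write_register(r0, 0xAB, 1)
r0_after_word_write = write_register(r0, 0xBEEF, 2)
```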

2.1.2 Address Registers 

The seven address registers are used by the processor to
implement specific address functions. Except for the MOD
register, which is 16 bits wide, all the others are 32 bits wide.
In the NS32CG16 only the lower 24 bits are implemented in the
six 32-bit address registers. The top 8 bits are always zero. A
description of the address registers follows.

PC — Program Counter. The PC register is a pointer to the 
first byte of the instruction currently being executed. The PC 
is used to reference memory in the program section. 

SP0, SP1 — Stack Pointers. The SP0 register points to the
lowest address of the last item stored on the INTERRUPT
STACK. This stack is normally used only by the operating
system. It is used primarily for storing temporary data, and
holding return information for operating system subroutines
and interrupt and trap service routines. The SP1 register
points to the lowest address of the last item stored on the
USER STACK. This stack is used by normal user programs
to hold temporary data and subroutine return information.
When a reference is made to the selected Stack Pointer
(see PSR S-bit), the terms 'SP Register' or 'SP' are used.
SP refers to either SP0 or SP1, depending on the setting of
the S bit in the PSR register. If the S bit in the PSR is 0, SP
refers to SP0. If the S bit in the PSR is 1, SP refers to
SP1.

Stacks in the Series 32000 family grow downward in memo- 
ry. A Push operation pre-decrements the Stack Pointer by 
the operand length. A Pop operation post-increments the 
Stack Pointer by the operand length. 
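
As a rough illustration of the push and pop rules just described, the following Python sketch models a downward-growing stack. The class name `Stack` and the use of a `bytearray` for memory are assumptions made for the example; the little-endian byte order follows the memory organization given in Section 2.2.

```python
class Stack:
    """Hypothetical model of a downward-growing Series 32000 style stack."""

    def __init__(self, memory_size, initial_sp):
        self.mem = bytearray(memory_size)
        self.sp = initial_sp

    def push(self, value, size):
        # Pre-decrement: SP moves down first, then the operand is stored.
        self.sp -= size
        self.mem[self.sp:self.sp + size] = value.to_bytes(size, "little")

    def pop(self, size):
        # Post-increment: the operand is read first, then SP moves up.
        value = int.from_bytes(self.mem[self.sp:self.sp + size], "little")
        self.sp += size
        return value


stack = Stack(memory_size=64, initial_sp=64)
stack.push(0x11223344, 4)   # SP now points at the lowest byte of this item
stack.push(0x5566, 2)
```

After each push, the stack pointer holds the lowest address of the last item stored, which matches the description of SP0 and SP1.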

FP — Frame Pointer. The FP register is used by a procedure 
to access parameters and local variables on the stack. The 
FP register is set up on procedure entry with the ENTER 
instruction and restored on procedure termination with the 
EXIT instruction. 

The frame pointer holds the address in memory occupied by 
the old contents of the frame pointer. 

SB — Static Base. The SB register points to the global
variables of a software module. This register is used to support
relocatable global variables for software modules. The SB
register holds the lowest address in memory occupied by
the global variables of a module.

INTBASE — Interrupt Base. The INTBASE register holds
the address of the dispatch table for interrupts and traps
(Section 3.2.1).

MOD — Module. The MOD register holds the address of the
module descriptor of the currently executing software module.
The MOD register is 16 bits long, therefore the module
table must be contained within the first 64 kbytes of memory.

2.1.3 Processor Status Register 

The Processor Status Register (PSR) holds status informa- 
tion for the microprocessor. 


The PSR is sixteen bits long, divided into two eight-bit 
halves. The low order eight bits are accessible to all pro- 
grams, but the high order eight bits are accessible only to 
programs executing in Supervisor Mode. 

FIGURE 2-2. Processor Status Register (PSR)


C The C bit indicates that a carry or borrow occurred after 
an addition or subtraction instruction. It can be used with 
the ADDC and SUBC instructions to perform multiple- 
precision integer arithmetic calculations. It may have a 
setting of 0 (no carry or borrow) or 1 (carry or borrow). 

T The T bit causes program tracing. If this bit is set to 1, a
TRC trap is executed after every instruction (Section
3.7.6).

L The L bit is altered by comparison instructions. In a com- 
parison instruction the L bit is set to “1” if the second 
operand is less than the first operand, when both oper- 
ands are interpreted as unsigned integers. Otherwise, it 
is set to “0”. In Floating-Point comparisons, this bit is 
always cleared. 

K Reserved for use by the CPU. 

J Reserved for use by the CPU. 

F The F bit is a general condition flag, which is altered by 
many instructions (e.g., integer arithmetic instructions 
use it to indicate overflow). 

Z The Z bit is altered by comparison instructions. In a com- 
parison instruction the Z bit is set to “1” if the second 
operand is equal to the first operand; otherwise it is set 
to “0”. 

N The N bit is altered by comparison instructions. In a
comparison instruction the N bit is set to “1” if the second
operand is less than the first operand, when both
operands are interpreted as signed integers. Otherwise,
it is set to “0”.

U If the U bit is “1” no privileged instructions may be exe- 
cuted. If the U bit is “0” then all instructions may be 
executed. When U = 0 the processor is said to be in Su- 
pervisor Mode; when U = 1 the processor is said to be in 
User Mode. A User Mode program is restricted from exe- 
cuting certain instructions and accessing certain regis- 
ters which could interfere with the operating system. For 
example, a User Mode program is prevented from 
changing the setting of the flag used to indicate its own 
privilege mode. A Supervisor Mode program is assumed 
to be a trusted part of the operating system, hence it has 
no such restrictions. 

S The S bit specifies whether the SP0 register or SP1 register
is used as the Stack Pointer. The bit is automatically
cleared on interrupts and traps. It may have a setting
of 0 (use the SP0 register) or 1 (use the SP1 register).

P The P bit prevents a TRC trap from occurring more than 
once for an instruction (Section 3.7.6). It may have a 
setting of 0 (no trace pending) or 1 (trace pending). 

I If I = 1, then all interrupts will be accepted. If I = 0, only
the NMI interrupt is accepted. Trap enables are not
affected by this bit.

B Reserved for use by the CPU. This bit is set to 1 during
the execution of the EXTBLT instruction and causes the
BPU signal to become active. Upon reset, B is set to
zero and the BPU signal is set high.

Note 1: When an interrupt is acknowledged, the B, I, P, S and U bits are set
to zero and the BPU signal is set high. A return from interrupt will
restore the original values from the copy of the PSR register saved
in the interrupt stack.

Note 2: If BITBLT (BB) instructions are executed in an interrupt routine, the
PSR bits J and K must be cleared first.
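
The following Python sketch restates how the Z, L and N bits described above respond to an integer comparison. The function names and the dictionary result are assumptions made for illustration; bit positions within the PSR are deliberately left out because Figure 2-2 is not reproduced here.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def compare_flags(first, second, bits=32):
    """Model the Z, L and N results of comparing two integer operands.

    Z: second operand equals the first.
    L: second operand is less than the first, both taken as unsigned.
    N: second operand is less than the first, both taken as signed.
    """
    mask = (1 << bits) - 1
    u1, u2 = first & mask, second & mask
    return {
        "Z": int(u2 == u1),
        "L": int(u2 < u1),
        "N": int(to_signed(u2, bits) < to_signed(u1, bits)),
    }


flags = compare_flags(first=1, second=0xFF, bits=8)
```

With 8-bit operands, 0xFF is larger than 1 when unsigned but smaller when signed, so L and N differ in this case.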

2.1.4 Configuration Register 

The Configuration Register (CFG) is 8 bits wide, of which 
four bits are implemented. The implemented bits are used to 
declare the presence of certain external devices and to se- 
lect the clock scaling factor. CFG is programmed by the 
SETCFG instruction. The format of CFG is shown in Figure 
2-3. The various control bits are described below. 

FIGURE 2-3. Configuration Register (CFG)

I Interrupt vectoring. This bit controls whether maskable
interrupts are handled in nonvectored (I = 0) or vectored
(I = 1) mode. Refer to Section 3.2.3 for more information.

F Floating-point instruction set. This bit indicates whether
a floating-point unit (FPU) is present to execute
floating-point instructions. If this bit is 0 when the CPU executes
a floating-point instruction, a Trap (UND) occurs. If this
bit is 1, then the CPU transfers the instruction and any
necessary operands to the FPU using the slave-processor
protocol described in Section 3.1.4.1.

M Clock scaling. This bit is used in conjunction with the C bit
to select the clock scaling factor.

C Clock scaling. Same as the M bit above. Refer to
Section 3.2.1 on “Power Save Mode” for details.

2.2 MEMORY ORGANIZATION 

The main memory of the NS32CG16 is a uniform linear
address space. Memory locations are numbered sequentially
starting at zero and ending at 2^24 - 1. The number
specifying a memory location is called an address. The contents of
each memory location is a byte consisting of eight bits.
Unless otherwise noted, diagrams in this document show data
stored in memory with the lowest address on the right and
the highest address on the left. Also, when data is shown
vertically, the lowest address is at the top of a diagram and
the highest address at the bottom of the diagram. When bits
are numbered in a diagram, the least significant bit is given
the number zero, and is shown at the right of the diagram.
Bits are numbered in increasing significance and toward the
left.

 7            0
+--------------+
|      A       |
+--------------+

Byte at Address A

Two contiguous bytes are called a word. Except where not- 
ed, the least significant byte of a word is stored at the lower 
address, and the most significant byte of the word is stored 
at the next higher address. In memory, the address of a 
word is the address of its least significant byte, and a word 
may start at any address. 

Word at Address A


Two contiguous words are called a double-word. Except 
where noted, the least significant word of a double-word is 
stored at the lowest address and the most significant word 
of the double-word is stored at the address two higher. In 
memory, the address of a double-word is the address of its 
least significant byte, and a double-word may start at any 
address. 

Although memory is addressed as bytes, it is actually
organized as words. Therefore, words and double-words that are
aligned to start at even addresses (multiples of two) are
accessed more quickly than words and double-words that
are not so aligned.
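
A small Python sketch may help picture the byte ordering and alignment rules above. The helper names are assumptions made for illustration. In particular, `word_slots_touched` only counts how many 16-bit memory words an operand overlaps; treating that count as a proxy for access cost is an assumption, not a timing figure from this data sheet.

```python
def read_word(mem, addr):
    """Read a 16-bit word: least significant byte at the lower address."""
    return int.from_bytes(mem[addr:addr + 2], "little")


def read_double_word(mem, addr):
    """Read a 32-bit double-word: least significant word at the lower address."""
    return int.from_bytes(mem[addr:addr + 4], "little")


def word_slots_touched(addr, size):
    """Count the 16-bit memory words overlapped by an operand of `size` bytes."""
    return (addr + size - 1) // 2 - addr // 2 + 1


memory = bytes([0x78, 0x56, 0x34, 0x12, 0xEF])
aligned_value = read_double_word(memory, 0)     # starts at an even address
unaligned_value = read_double_word(memory, 1)   # may start at any address
```

An aligned double-word overlaps two memory words, while the same operand at an odd address overlaps three, which is consistent with aligned accesses being quicker.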

2.2.1 Dedicated Tables 

Two of the NS32CG16 dedicated registers (MOD and INT- 
BASE) serve as pointers to dedicated tables in memory. 
The INTBASE register points to the Interrupt Dispatch and 
Cascade tables. These are described in Section 3.8. 

The MOD register contains a pointer into the Module Table, 
whose entries are called Module Descriptors. A Module De- 
scriptor contains four pointers, three of which are used by 
the NS32CG16. The MOD register contains the address of 
the Module Descriptor for the currently running module. It is 
automatically updated by the Call External Procedure in- 
structions (CXP and CXPD). 

The format of a Module Descriptor is shown in Figure 2-4. 
The Static Base entry contains the address of static data 
assigned to the running module. It is loaded into the CPU 
Static Base register by the CXP and CXPD instructions. The 
Program Base entry contains the address of the first byte of 
instruction code in the module. Since a module may have 
multiple entry points, the Program Base pointer serves only 
as a reference to find them. 

FIGURE 2-4. Module Descriptor Format

The Link Table Address points to the Link Table for the 
currently running module. The Link Table provides the infor- 
mation needed for: 

1) Sharing variables between modules. Such variables 
are accessed through the Link Table via the External 
addressing mode. 

2) Transferring control from one module to another. This 
is done via the Call External Procedure (CXP) instruc- 
tion. 

The format of a Link Table is given in Figure 2-5. A Link 
Table Entry for an external variable contains the 32-bit ad- 
dress of that variable. An entry for an external procedure 
contains two 16-bit fields: Module and Offset. The Module 
field contains the new MOD register contents for the mod- 
ule being entered. The Offset field is an unsigned number 
giving the position of the entry point relative to the new 
module’s Program Base pointer. 

For further details of the functions of these tables, see the 
Series 32000 Instruction Set Reference Manual. 
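
The way an external procedure entry is resolved can be sketched in Python as below. The `ModuleDescriptor` class, the dictionary used as a Module Table, and the function name are hypothetical; the in-memory layouts of Figures 2-4 and 2-5 are not modeled, only the relationships stated in the text (new MOD value, Static Base load, entry point relative to Program Base).

```python
from dataclasses import dataclass


@dataclass
class ModuleDescriptor:
    """Hypothetical view of the three descriptor entries used by the CPU."""
    static_base: int
    link_table_address: int
    program_base: int


def resolve_external_procedure(module_table, module, offset):
    """Return the register values implied by a Link Table procedure entry.

    `module` is the new MOD register content and `offset` is the unsigned
    position of the entry point relative to the new module's Program Base.
    """
    descriptor = module_table[module]
    return {
        "MOD": module,
        "SB": descriptor.static_base,
        "PC": descriptor.program_base + offset,
    }


module_table = {0x0020: ModuleDescriptor(0x1000, 0x2000, 0x8000)}
target = resolve_external_procedure(module_table, module=0x0020, offset=0x0134)
```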

FIGURE 2-5. A Sample Link Table


2.3 INSTRUCTION SET 

2.3.1 General Instruction Format 

Figure 2-6 shows the general format of a Series 32000 in- 
struction. The Basic Instruction is one to three bytes long 
and contains the Opcode and up to two 5-bit General Ad- 
dressing Mode (“Gen”) fields. Following the Basic Instruc- 
tion field is a set of optional extensions, which may appear 
depending on the instruction and the addressing modes se- 
lected. 

Index Bytes appear when either or both Gen fields specify 
Scaled Index. In this case, the Gen field specifies only the 
Scale Factor (1, 2, 4 or 8), and the Index Byte specifies 
which General Purpose Register to use as the index, and 
which addressing mode calculation to perform before index- 
ing. See Figure 2-7. 


Following Index Bytes come any displacements (addressing
constants) or immediate values associated with the selected
addressing modes. Each Disp/Imm field may contain
one or two displacements, or one immediate value. The size
of a Displacement field is encoded within the top bits of that
field, as shown in Figure 2-8, with the remaining bits
interpreted as a signed (two's complement) value. The size of an
immediate value is determined from the Opcode field. Both
Displacement and Immediate fields are stored
most-significant byte first. Note that this is different from the memory
representation of data (Section 2.2).
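
The displacement rules above can be sketched as a small Python decoder. The prefix patterns used here (a leading 0 for a one-byte form, 10 for a two-byte form, 11 for a four-byte form) are an assumption drawn from the general Series 32000 encoding, since the displacement figure is not reproduced in this text; the function name is also hypothetical. The byte and word ranges it yields agree with the -64 to +63 and -8192 to +8191 ranges quoted for Figure 2-8.

```python
def decode_displacement(data, pos=0):
    """Decode one displacement field, returning (value, size in bytes).

    Assumed encoding: the top bits of the first byte select the size and
    the remaining bits form a signed value, stored most significant byte first.
    """
    first = data[pos]
    if first & 0x80 == 0:          # 0xxxxxxx: one byte, 7-bit signed value
        size, bits = 1, 7
    elif first & 0xC0 == 0x80:     # 10xxxxxx: two bytes, 14-bit signed value
        size, bits = 2, 14
    else:                          # 11xxxxxx: four bytes, 30-bit signed value
        size, bits = 4, 30
    if pos + size > len(data):
        raise ValueError("displacement field is truncated")
    raw = int.from_bytes(data[pos:pos + size], "big") & ((1 << bits) - 1)
    if raw >> (bits - 1):          # sign bit set: convert from two's complement
        raw -= 1 << bits
    return raw, size


value, size = decode_displacement(bytes([0xA0, 0x00]))
```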

Some instructions require additional “implied” immediates 
and/or displacements, apart from those associated with ad- 
dressing modes. Any such extensions appear at the end of 
the instruction, in the order that they appear within the list of 
operands in the instruction definition (Section 2.3.3). 

FIGURE 2-7. Index Byte Format


2.3.2 Addressing Modes 

The NS32CG16 CPU generally accesses an operand by cal- 
culating its Effective Address based on information avail- 
able when the operand is to be accessed. The method to be 
used in performing this calculation is specified by the pro- 
grammer as an “addressing mode.” 

Addressing modes in the NS32CG16 are designed to opti- 
mally support high-level language accesses to variables. In 
nearly all cases, a variable access requires only one ad- 
dressing mode, within the instruction that acts upon that 
variable. Extraneous data movement is therefore minimized. 

NS32CG16 Addressing Modes fall into nine basic types:

Register: The operand is available in one of the eight
General Purpose Registers. In certain Slave Processor
instructions, an auxiliary set of eight registers may be referenced
instead.

Register Relative: A General Purpose Register contains an 
address to which is added a displacement value from the 
instruction, yielding the Effective Address of the operand in 
memory. 

FIGURE 2-6. General Instruction Format

Memory Space: Identical to Register Relative above, ex- 
cept that the register used is one of the dedicated registers 
PC, SP, SB or FP. These registers point to data areas gen- 
erally needed by high-level languages. 

Memory Relative: A pointer variable is found within the 
memory space pointed to by the SP, SB or FP register. A 
displacement is added to that pointer to generate the Effec- 
tive Address of the operand. 


[Figure 2-8 detail: signed displacement encodings. Byte Displacement: range -64 to +63. Word Displacement: range -8192 to +8191.]

Immediate: The operand is encoded within the instruction.
This addressing mode is not allowed if the operand is to be
written.

Absolute: The address of the operand is specified by a 
displacement field in the instruction. 

External: A pointer value is read from a specified entry of
the current Link Table. To this pointer value is added a
displacement, yielding the Effective Address of the operand.

Top of Stack: The currently-selected Stack Pointer (SP0 or
SP1) specifies the location of the operand. The operand is
pushed or popped, depending on whether it is written or
read.

Scaled Index: Although encoded as an addressing mode,
Scaled Indexing is an option on any addressing mode
except Immediate or another Scaled Index. It has the effect of
calculating an Effective Address, then multiplying any
General Purpose Register by 1, 2, 4 or 8 and adding into the
total, yielding the final Effective Address of the operand.

Table 2-1 is a brief summary of the addressing modes. For a
complete description of their actions, see the Series 32000
Instruction Set Reference Manual.

In addition to the general modes, Register-Indirect with
auto-increment/decrement and warp (or pitch) are available
on several of the graphics instructions.

FIGURE 2-8. Displacement Encodings



TABLE 2-1. NS32CG16 Addressing Modes

ENCODING  MODE                    ASSEMBLER SYNTAX     EFFECTIVE ADDRESS

Register
00000     Register 0              R0 or F0             None: Operand is in the specified
00001     Register 1              R1 or F1             register.
00010     Register 2              R2 or F2
00011     Register 3              R3 or F3
00100     Register 4              R4 or F4
00101     Register 5              R5 or F5
00110     Register 6              R6 or F6
00111     Register 7              R7 or F7

Register Relative
01000     Register 0 relative     disp(R0)             Disp + Register.
01001     Register 1 relative     disp(R1)
01010     Register 2 relative     disp(R2)
01011     Register 3 relative     disp(R3)
01100     Register 4 relative     disp(R4)
01101     Register 5 relative     disp(R5)
01110     Register 6 relative     disp(R6)
01111     Register 7 relative     disp(R7)

Memory Relative
10000     Frame memory relative   disp2(disp1(FP))     Disp2 + Pointer; Pointer found at
10001     Stack memory relative   disp2(disp1(SP))     address Disp1 + Register. "SP"
10010     Static memory relative  disp2(disp1(SB))     is either SP0 or SP1, as selected
                                                       in PSR.

Reserved
10011     (Reserved for Future Use)

Immediate
10100     Immediate               value                None: Operand is input from
                                                       instruction queue.

Absolute
10101     Absolute                @disp                Disp.

External
10110     External                EXT(disp1) + disp2   Disp2 + Pointer; Pointer is found
                                                       at Link Table Entry number Disp1.

Top Of Stack
10111     Top of stack            TOS                  Top of current stack, using either
                                                       User or Interrupt Stack Pointer,
                                                       as selected in PSR. Automatic
                                                       Push/Pop included.

Memory Space
11000     Frame memory            disp(FP)             Disp + Register; "SP" is either
11001     Stack memory            disp(SP)             SP0 or SP1, as selected in PSR.
11010     Static memory           disp(SB)
11011     Program memory          * + disp

Scaled Index
11100     Index, bytes            mode[Rn:B]           EA(mode) + Rn.
11101     Index, words            mode[Rn:W]           EA(mode) + 2 x Rn.
11110     Index, double words     mode[Rn:D]           EA(mode) + 4 x Rn.
11111     Index, quad words       mode[Rn:Q]           EA(mode) + 8 x Rn.
                                                       "Mode" and "n" are contained
                                                       within the Index Byte.
                                                       EA(mode) denotes the effective
                                                       address generated using mode.
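
The effective address column of Table 2-1 can be restated as a few Python helpers. This is a simplified sketch: the function names, the callable used to read a pointer from memory, and the wrap to 24 bits (taken from the 2^24 address space of Section 2.2) are assumptions made for illustration, and index values are treated as plain non-negative integers.

```python
ADDRESS_MASK = (1 << 24) - 1
SCALE = {"B": 1, "W": 2, "D": 4, "Q": 8}   # bytes, words, double words, quad words


def register_relative(register_value, disp):
    """disp(Rn) or disp(FP/SP/SB): Disp + Register."""
    return (register_value + disp) & ADDRESS_MASK


def memory_relative(read_pointer, register_value, disp1, disp2):
    """disp2(disp1(reg)): Disp2 + Pointer, Pointer found at Disp1 + Register."""
    pointer = read_pointer((register_value + disp1) & ADDRESS_MASK)
    return (pointer + disp2) & ADDRESS_MASK


def scaled_index(base_ea, index_value, size):
    """mode[Rn:size]: EA(mode) + scale x Rn."""
    return (base_ea + SCALE[size] * index_value) & ADDRESS_MASK


base = register_relative(0x1000, 8)      # 8(R1) with R1 = 0x1000
element = scaled_index(base, 3, "D")     # 8(R1)[R2:D] with R2 = 3
```

The scaled index form is what lets an array element be reached with a single addressing mode: the base mode locates the array and the index register selects the element.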

2.3.3 Instruction Set Summary 


Table 2-2 presents a brief description of the NS32CG16
instruction set. The Format column refers to the Instruction
Format tables (Appendix A). The Instruction column gives
the instruction as coded in assembly language, and the
Description column provides a short description of the function
provided by that instruction. Further details of the exact
operations performed by each instruction may be found in the
Series 32000 Instruction Set Reference Manual and the
NS32CG16 Printer/Display Processor Programmer's Reference.

Notations:

i = Integer length suffix:         B = Byte
                                   W = Word
                                   D = Double Word

f = Floating Point length suffix:  F = Standard Floating
                                   L = Long Floating

gen = General operand. Any addressing mode can be specified.

short = A 4-bit value encoded within the Basic Instruction
(see Appendix A for encodings).

imm = Implied immediate operand. An 8-bit value appended
after any addressing extensions.

disp = Displacement (addressing constant): 8, 16 or 32 bits.
All three lengths legal.

reg = Any General Purpose Register: R0-R7.

areg = Any Processor Register: SP, SB, FP, INTBASE,
MOD, PSR, US (bottom 8 PSR bits).

cond = Any condition code, encoded as a 4-bit field within
the Basic Instruction (see Appendix A for encodings).



TABLE 2-2. NS32CG16 Instruction Set Summary

MOVES

Format  Operation  Operands          Description
4       MOVi       gen,gen           Move a value.
2       MOVQi      short,gen         Extend and move a signed 4-bit constant.
7       MOVMi      gen,gen,disp      Move multiple: disp bytes (1 to 16).
7       MOVZBW     gen,gen           Move with zero extension.
7       MOVZiD     gen,gen           Move with zero extension.
7       MOVXBW     gen,gen           Move with sign extension.
7       MOVXiD     gen,gen           Move with sign extension.
4       ADDR       gen,gen           Move effective address.

INTEGER ARITHMETIC

Format  Operation  Operands          Description
4       ADDi       gen,gen           Add.
2       ADDQi      short,gen         Add signed 4-bit constant.
4       ADDCi      gen,gen           Add with carry.
4       SUBi       gen,gen           Subtract.
4       SUBCi      gen,gen           Subtract with carry (borrow).
6       NEGi       gen,gen           Negate (2's complement).
6       ABSi       gen,gen           Take absolute value.
7       MULi       gen,gen           Multiply.
7       QUOi       gen,gen           Divide, rounding toward zero.
7       REMi       gen,gen           Remainder from QUO.
7       DIVi       gen,gen           Divide, rounding down.
7       MODi       gen,gen           Remainder from DIV (Modulus).
7       MEIi       gen,gen           Multiply to extended integer.
7       DEIi       gen,gen           Divide extended integer.

PACKED DECIMAL (BCD) ARITHMETIC

Format  Operation  Operands          Description
6       ADDPi      gen,gen           Add packed.
6       SUBPi      gen,gen           Subtract packed.

INTEGER COMPARISON

Format  Operation  Operands          Description
4       CMPi       gen,gen           Compare.
2       CMPQi      short,gen         Compare to signed 4-bit constant.
7       CMPMi      gen,gen,disp      Compare multiple: disp bytes (1 to 16).

LOGICAL AND BOOLEAN

Format  Operation  Operands          Description
4       ANDi       gen,gen           Logical AND.
4       ORi        gen,gen           Logical OR.
4       BICi       gen,gen           Clear selected bits.
4       XORi       gen,gen           Logical exclusive OR.
6       COMi       gen,gen           Complement all bits.
6       NOTi       gen,gen           Boolean complement: LSB only.
2       Scondi     gen               Save condition code (cond) as a Boolean variable of size i.

SHIFTS

Format  Operation  Operands          Description
6       LSHi       gen,gen           Logical shift, left or right.
6       ASHi       gen,gen           Arithmetic shift, left or right.
6       ROTi       gen,gen           Rotate, left or right.

BIT FIELDS

Bit fields are values in memory that are not aligned to byte boundaries. Examples are PACKED arrays and records used in
Pascal. "Extract" instructions read and align a bit field. "Insert" instructions write a bit field from an aligned source.

Format  Operation  Operands          Description
8       EXTi       reg,gen,gen,disp  Extract bit field (array oriented).
8       INSi       reg,gen,gen,disp  Insert bit field (array oriented).
7       EXTSi      gen,gen,imm,imm   Extract bit field (short form).
7       INSSi      gen,gen,imm,imm   Insert bit field (short form).
8       CVTP       reg,gen,gen       Convert to bit field pointer.

ARRAYS

Format  Operation  Operands          Description
8       CHECKi     reg,gen,gen       Index bounds check.
8       INDEXi     reg,gen,gen       Recursive indexing step for multiple-dimensional arrays.

STRINGS

String instructions assign specific functions to the General
Purpose Registers:

R4 - Comparison Value
R3 - Translation Table Pointer
R2 - String 2 Pointer
R1 - String 1 Pointer
R0 - Limit Count

Options on all string instructions are:

B (Backward):     Decrement string pointers after each
                  step rather than incrementing.
U (Until match):  End instruction if String 1 entry matches
                  R4.
W (While match):  End instruction if String 1 entry does not
                  match R4.

All string instructions end when R0 decrements to zero.

Format  Operation  Operands          Description
5       MOVSi      options           Move string 1 to string 2.
        MOVST      options           Move string, translating bytes.
5       CMPSi      options           Compare string 1 to string 2.
        CMPST      options           Compare, translating string 1 bytes.
5       SKPSi      options           Skip over string 1 entries.
        SKPST      options           Skip, translating bytes for until/while.

JUMPS AND LINKAGE

Format  Operation  Operands          Description
3       JUMP       gen               Jump.
0       BR         disp              Branch (PC Relative).
0       Bcond      disp              Conditional branch.
3       CASEi      gen               Multiway branch.
2       ACBi       short,gen,disp    Add 4-bit constant and branch if non-zero.
3       JSR        gen               Jump to subroutine.
1       BSR        disp              Branch to subroutine.
1       CXP        disp              Call external procedure.
3       CXPD       gen               Call external procedure using descriptor.
1       SVC                          Supervisor call.
1       FLAG                         Flag trap.
1       BPT                          Breakpoint trap.
1       ENTER      [reg list],disp   Save registers and allocate stack frame (Enter Procedure).
1       EXIT       [reg list]        Restore registers and reclaim stack frame (Exit Procedure).
1       RET        disp              Return from subroutine.
1       RXP        disp              Return from external procedure call.
1       RETT       disp              Return from trap. (Privileged)
1       RETI                         Return from interrupt. (Privileged)

CPU REGISTER MANIPULATION

Format  Operation  Operands          Description
1       SAVE       [reg list]        Save general purpose registers.
1       RESTORE    [reg list]        Restore general purpose registers.
2       LPRi       areg,gen          Load dedicated register. (Privileged if PSR or INTBASE)
2       SPRi       areg,gen          Store dedicated register. (Privileged if PSR or INTBASE)
3       ADJSPi     gen               Adjust stack pointer.
3       BISPSRi    gen               Set selected bits in PSR. (Privileged if not Byte length)
3       BICPSRi    gen               Clear selected bits in PSR. (Privileged if not Byte length)
5       SETCFG     [option list]     Set configuration register. (Privileged)



FLOATING POINT

Format  Operation  Operands          Description
11      MOVf       gen,gen           Move a floating point value.
9       MOVLF      gen,gen           Move and shorten a long value to standard.
9       MOVFL      gen,gen           Move and lengthen a standard value to long.
9       MOVif      gen,gen           Convert any integer to standard or long floating.
9       ROUNDfi    gen,gen           Convert to integer by rounding.
9       TRUNCfi    gen,gen           Convert to integer by truncating, toward zero.
9       FLOORfi    gen,gen           Convert to largest integer less than or equal to value.
11      ADDf       gen,gen           Add.
11      SUBf       gen,gen           Subtract.
11      MULf       gen,gen           Multiply.
11      DIVf       gen,gen           Divide.
11      CMPf       gen,gen           Compare.
11      NEGf       gen,gen           Negate.
11      ABSf       gen,gen           Take absolute value.
9       LFSR       gen               Load FSR.
9       SFSR       gen               Store FSR.
12      POLYf      gen,gen           Polynomial Step.
12      DOTf       gen,gen           Dot Product.
12      SCALBf     gen,gen           Binary Scale.
12      LOGBf      gen,gen           Binary Log.

MISCELLANEOUS

Format  Operation  Operands          Description
1       NOP                          No operation.
1       WAIT                         Wait for interrupt.
1       DIA                          Diagnose. Single-byte "Branch to Self" for hardware
                                     breakpointing. Not for use in programming.

GRAPHICS

Format  Operation  Operands          Description
5       BBOR       options*          Bit-aligned block transfer 'OR'.
5       BBAND      options           Bit-aligned block transfer 'AND'.
5       BBFOR                        Bit-aligned block transfer fast 'OR'.
5       BBXOR      options           Bit-aligned block transfer 'XOR'.
5       BBSTOD     options           Bit-aligned block source to destination.
5       BITWT                        Bit-aligned word transfer.
5       EXTBLT     options           External bit-aligned block transfer.
5       MOVMPi                       Move multiple pattern.
5       TBITS      options           Test bit string.
5       SBITS                        Set bit string.
5       SBITPS                       Set bit perpendicular string.

BITS

Format  Operation  Operands          Description
4       TBITi      gen,gen           Test bit.
6       SBITi      gen,gen           Test and set bit.
6       SBITIi     gen,gen           Test and set bit, interlocked.
6       CBITi      gen,gen           Test and clear bit.
6       CBITIi     gen,gen           Test and clear bit, interlocked.
6       IBITi      gen,gen           Test and invert bit.
8       FFSi       gen,gen           Find first set bit.

*Note: Options are controlled by fields of the instruction, PSR status bits, or dedicated register values.
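
Table 2-2 lists two pairs of integer division instructions that differ only in rounding: QUOi/REMi round toward zero, while DIVi/MODi round down. The Python sketch below illustrates the difference on plain integers. The function names and the (dividend, divisor) argument order are assumptions made for the example and do not reflect the operand order of the actual instructions.

```python
def quo(dividend, divisor):
    """Quotient rounded toward zero, as described for QUOi."""
    magnitude = abs(dividend) // abs(divisor)
    return magnitude if (dividend < 0) == (divisor < 0) else -magnitude


def rem(dividend, divisor):
    """Remainder that pairs with quo, as described for REMi."""
    return dividend - divisor * quo(dividend, divisor)


def div(dividend, divisor):
    """Quotient rounded down (toward minus infinity), as described for DIVi."""
    return dividend // divisor


def mod(dividend, divisor):
    """Remainder that pairs with div, as described for MODi."""
    return dividend % divisor


pair_toward_zero = (quo(-7, 2), rem(-7, 2))
pair_rounded_down = (div(-7, 2), mod(-7, 2))
```

The two pairs agree whenever both operands have the same sign and differ only when the exact quotient is negative and not an integer.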
2.4 GRAPHICS SUPPORT

The following sections provide a brief description of the 
NS32CG16 graphics support capabilities. Basic discussions 
on frame buffer addressing and BITBLT operations are also 
provided. More detailed information on the NS32CG16 
graphics support instructions can be found in the 
NS32CG16 Printer/Display Processor Programmer’s Refer- 
ence. 

2.4.1 Frame Buffer Addressing 

There are two basic addressing schemes for referencing 
pixels within the frame buffer: Linear and Cartesian (or x-y). 
Linear addressing associates a single number to each pixel 
representing the physical address of the corresponding bit 
in memory. Cartesian addressing associates two numbers 
to each pixel representing the x and y coordinates of the 
pixel relative to a point in the Cartesian space taken as the 
origin. The Cartesian space is generally defined as having 
the origin in the upper left. A movement to the right increas- 
es the x coordinate; a movement downward increases the y 
coordinate. 

The correspondence between the location of a pixel in the 
Cartesian space and the physical (BIT) address in memory 
is shown in Figure 2-9. The origin of the Cartesian space 
(x=0, y=0) corresponds to the bit address ‘ORG’. Incre- 
menting the x coordinate increments the bit address by one. 
Incrementing the y coordinate increments the bit address by 
an amount representing the warp (or pitch) of the Cartesian 
space. Thus, the linear address of a pixel at location (x, y) in 
the Cartesian space can be found by the following expres- 
sion. 


    ADDR = ORG + y * WARP + x

Warp is the distance (in bits) in the physical memory space
between two vertically adjacent bits in the Cartesian space.
Example 1 below shows two NS32CG16 instruction se- 
quences to set a single pixel given the x and y coordinates. 
Example 2 shows how to create a fat pixel by setting four 
adjacent bits in the Cartesian space. 

Example 1: Set pixel at location (x, y) 

Setup: R0 x coordinate 
R1 y coordinate 

Instruction Sequence 1: 

MULD WARP, R1 ; Y*WARP 

ADDD R0, R1 ; + X = BIT OFFSET 

SBITD R1, ORG ; SET PIXEL 

Instruction Sequence 2: 

INDEXD R1, (WARP-1), R0 ; Y*WARP + X 
SBITD R1, ORG ; SET PIXEL 


Example 2: Create fat pixel by setting bits at locations 
(x, y), (x+1, y), (x, y+1) and (x+1, y+1). 

Setup: R0 x coordinate 
R1 y coordinate 

Instruction Sequence: 

INDEXD R1, (WARP-1), R0 ; BIT ADDRESS 
SBITD R1, ORG ; SET FIRST PIXEL 
ADDQD 1, R1 ; (X+1, Y) 
SBITD R1, ORG ; SECOND PIXEL 
ADDD (WARP-1), R1 ; (X, Y+1) 
SBITD R1, ORG ; THIRD PIXEL 
ADDQD 1, R1 ; (X+1, Y+1) 
SBITD R1, ORG ; LAST PIXEL 
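The address arithmetic used by both examples can be modeled in a few lines of Python. This is only an illustrative sketch: the frame buffer is represented as a bytearray, bit n is assumed to live in byte n // 8 at bit position n % 8, and the function names are hypothetical.

```python
# Hypothetical model: the frame buffer is a bytearray and ORG is a bit
# address within it. Bit n is assumed to be bit (n % 8) of byte (n // 8).

def bit_address(org, warp, x, y):
    """Linear bit address of pixel (x, y): ADDR = ORG + y * WARP + x."""
    return org + y * warp + x

def set_pixel(buf, org, warp, x, y):
    """Set the single bit that corresponds to pixel (x, y)."""
    addr = bit_address(org, warp, x, y)
    buf[addr // 8] |= 1 << (addr % 8)

def set_fat_pixel(buf, org, warp, x, y):
    """Set (x, y), (x+1, y), (x, y+1) and (x+1, y+1)."""
    for dy in (0, 1):
        for dx in (0, 1):
            set_pixel(buf, org, warp, x + dx, y + dy)

frame = bytearray(32 * 32 // 8)   # 32 x 32 pixels, one bit per pixel
set_fat_pixel(frame, org=0, warp=32, x=3, y=2)
```

With a warp of 32, the four bits of the fat pixel land at bit addresses 67, 68, 99 and 100.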


[Figure: Cartesian pixel grid labeled with bit addresses ORG, ORG+1, ORG+2, ...]

FIGURE 2-9. Correspondence between Linear and 
Cartesian Addressing 

2.4.2 BITBLT Fundamentals 

BITBLT, BIT-aligned BLock Transfer, is a general opera- 
tor that provides a mechanism to move an arbitrary size 
rectangle of an image from one part of the frame buffer 
to another. During the data transfer process a bitwise 
logical operation can be performed between the source 
and the destination data. BITBLT is also called Raster- 
Op: operations on rasters. It defines two rectangular ar- 
eas, source and destination, and performs a logical oper- 
ation (e.g., AND, OR, XOR) between these two areas and 
stores the result back to the destination. It can be ex- 
pressed in simple notation as: 

Source op Destination -> Destination 
op: AND, OR, XOR, etc. 
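
As a rough illustration of this notation, the following Python sketch applies one logical operation word by word. The 16-bit word size and the function name are assumptions made for the example, and alignment and masking are ignored.

```python
import operator

# Word-wise model of "Source op Destination -> Destination".
OPS = {"AND": operator.and_, "OR": operator.or_, "XOR": operator.xor}

def raster_op(src_words, dst_words, op):
    """Combine each source word into the matching destination word in place."""
    fn = OPS[op]
    for i, s in enumerate(src_words):
        dst_words[i] = fn(s, dst_words[i]) & 0xFFFF

dst = [0x00FF, 0x0F0F]
raster_op([0x0FF0, 0xFFFF], dst, "XOR")
```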
2.4.2.1 Frame Buffer Architecture 

There are two basic types of frame buffer architectures: 
plane-oriented or pixel-oriented. BITBLT takes advantage of 
the plane-oriented frame buffer architecture's attribute of 
multiple, adjacent pixels-per-word, facilitating the movement 
of large blocks of data. The source and destination starting 
addresses are expressed as pixel addresses. The width and 
height of the block to be moved are expressed in terms of 
pixels and scan lines. The source block may start and end 
at any bit position of any word, and the same applies for the 
destination block. 

2.4.2.2 Bit Alignment 

Before a logical operation can be performed between the 
source and the destination data, the source data must first 
be bit aligned to the destination data. In Figure 2-10, the 
source data needs to be shifted three bits to the right in 
order to align the first pixel (i.e., the pixel at the top left 
corner) in the source data block to the first pixel in the desti- 
nation data block. 

2.4.2.3 Block Boundaries and Destination Masks 

Each BITBLT destination scan line may start and end at any 
bit position in any data word. The neighboring bits (bits shar- 
ing the same word address with any words in the destination 
data block, but not a part of the BITBLT rectangle) of the 
BITBLT destination scan line must remain unchanged after 
the BITBLT operation. 


Due to the plane-oriented frame buffer architecture, all 
memory operations must be word-aligned. In order to pre- 
serve the neighboring bits surrounding the BITBLT destina- 
tion block, both a left mask and a right mask are needed for 
all the leftmost and all the rightmost data words of the desti- 
nation block. The left mask and the right mask both remain 
the same during a BITBLT operation. 

The following example illustrates the bit alignment require- 
ments. In this example, the memory data path is 16 bits 
wide. Figure 2-10 shows a 32 pixel by 32 scan line frame 
buffer which is organized as a long bit stream which wraps 
around every two words (32 bits). The origin (top left corner) 
of the frame buffer starts from the lowest word in memory 
(word address 00 (hex)). 

Each word in the memory contains 16 bits, D0-D15. The 
least significant bit of a memory word, D0, is defined as the 
first displayed pixel in a word. In this example, BITBLT ad- 
dresses are expressed as pixel addresses relative to the 
origin of the frame buffer. The source block starting address 
is 021 (hex) (the second pixel in the third word). The desti- 
nation block starting address is 204 (hex) (the fifth pixel in 
the 33rd word). The block width is 13 (hex), and the height is 
06 (hex) (corresponding to 6 scan lines). The shift value is 3. 
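
The numbers in this example can be checked with a short Python sketch. It assumes only what the text states (16-bit words, pixel addresses relative to the origin); the helper names are hypothetical.

```python
WORD_BITS = 16  # memory data path width used in the example

def split_pixel_address(addr):
    """Return (word index, pixel number within word) for a pixel address."""
    return divmod(addr, WORD_BITS)

def shift_value(src_addr, dst_addr):
    """Number of bits the source must be shifted to line up with the destination."""
    return (dst_addr - src_addr) % WORD_BITS

src_word, src_pixel = split_pixel_address(0x021)   # third word, second pixel
dst_word, dst_pixel = split_pixel_address(0x204)   # 33rd word, fifth pixel
shift = shift_value(0x021, 0x204)
```

Word and pixel indices are zero-based here, so the third word has index 2 and the fifth pixel has index 4.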




[Figure: frame buffer layout showing word boundaries and pixel numbers within words]

FIGURE 2-10. 32-Pixel by 32-Scan Line Frame Buffer 

[Figure: panels (a) and (b)]

FIGURE 2-11. Overlapping BITBLT Blocks 


The left mask and the right mask are 0000,1111,1111,1111 
and 1111,1111,0000,0000 respectively. 

Note 1: Zeros in either the left mask or the right mask indicate the 
destination bits which will not be modified. 

Note 2: The BB(function) and EXTBLT instructions use different setup 
parameters and techniques. 
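
The role of a mask can be sketched as a read-modify-write on one destination word. This Python fragment is an illustration only; the mask and data values are hypothetical and are not the ones from Figure 2-10.

```python
def masked_write(old_word, new_word, mask):
    """Merge new_word into old_word; only bits that are 1 in mask are modified."""
    return (new_word & mask) | (old_word & ~mask & 0xFFFF)

# Hypothetical values: preserve the low four bits of the destination word.
result = masked_write(old_word=0xABCD, new_word=0x1234, mask=0xFFF0)
```

Bits where the mask is zero keep their old value, which matches the convention stated in Note 1.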

2.4.2.4 BITBLT Directions 

A BITBLT operation moves a rectangular block of data in a 
frame buffer. The operation itself can be considered as a 
subroutine with two nested loops. The loops are preceded 
by setup operations. In the outer loop the source and 
destination starting addresses are calculated, and the test for 
completion is performed. In the inner loop the actual data 
movement for a single scan line takes place. The length of 
the inner loop is the number of (aligned) words spanned by 
each scan line. The length of the outer loop is equal to the 
height (number of scan lines) of the block to be moved. A 
skeleton of the subroutine representing the BITBLT 
operation follows. 

BITBLT:     calculate BITBLT setup parameters; 
            (once per BITBLT operation), 
            such as 
               width, height 
               bit misalignment (shift number) 
               left, right masks 
               horizontal, vertical directions 
               etc 

OUTERLOOP:  calculate source, dest addresses; 
            (once per scanline). 

INNERLOOP:  move data, (logical operation) and increment addresses; 
            (once per word). 

            UNTIL done horizontally 

            UNTIL done vertically 

RETURN (from BITBLT). 

Note: In the NS32CG16 only the setup operations must be done by the 
programmer. The inner and outer loops are automatically executed 
by the BITBLT instructions. 

Each loop can be executed in one of two directions: the 
inner loop from left to right or right to left, the outer loop 
from top to bottom (down) or bottom to top (up). 

The ability to move data starting from any corner of the 
BITBLT rectangle is necessary to avoid destroying the 
BITBLT source data as a result of destination writes when 
the source and destination are overlapped (i.e., when they 
share pixels). This situation is routinely encountered while 
panning or scrolling. 

A determination of the correct execution directions of the 
BITBLT must be performed whenever the source and desti- 
nation rectangles overlap. Any overlap will result in the de- 
struction of source data (from a destination write) if the cor- 
rect vertical direction is not used. Horizontal BITBLT direc- 
tion is of concern only in certain cases of overlap, as will be 
explained below. 

Figures 2-11(a) and (b) illustrate two cases of overlap. Here, 
the BITBLT rectangles are three pixels wide by five scan 
lines high; they overlap by a single pixel in (a) and a single 
column of pixels in (b). For purposes of illustration, the 
BITBLT is assumed to be carried out pixel-by-pixel. This 
convention does not affect the conclusions. 

In Figure 2-11(a), if the BITBLT is performed in the UP 
direction (bottom-to-top) one of the transfers of the bottom scan 
line of the source will write to the circled pixel of the 
destination. Due to the overlap, this pixel is also part of the 
uppermost scan line of the source rectangle. Thus, data needed 
later is destroyed. Therefore, this BITBLT must be 
performed in the DOWN direction. Another example of this 
occurs any time the screen is moved in a purely vertical 
direction, as in scrolling text. It should be noted that, in both of 
these cases, the choice of horizontal BITBLT direction may 
be made arbitrarily. 

Figure 2-11(b) demonstrates a case in which the horizontal 
BITBLT direction may not be chosen arbitrarily. This is an 
instance of purely horizontal movement of data (panning). 
Because the movement from source to destination involves 
data within the same scan line, the incorrect direction of 
movement will overwrite data which will be needed later. In 
this example, the correct direction is from right to left. 
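
The direction rules can be illustrated with a pixel-by-pixel model in Python, in the same spirit as the figures. This is a sketch under simplifying assumptions: the frame buffer is a list of rows, the function names are hypothetical, and no attempt is made to mirror how the BB instructions encode direction.

```python
def choose_directions(src_x, src_y, dst_x, dst_y):
    """Pick loop directions that never overwrite source pixels still needed."""
    vertical = "up" if dst_y > src_y else "down"          # down = top to bottom
    horizontal = "right_to_left" if dst_x > src_x else "left_to_right"
    return vertical, horizontal

def blit(fb, src_x, src_y, dst_x, dst_y, width, height):
    """Copy a rectangle inside fb (a list of rows), one pixel at a time."""
    vertical, horizontal = choose_directions(src_x, src_y, dst_x, dst_y)
    rows = range(height) if vertical == "down" else range(height - 1, -1, -1)
    cols = range(width) if horizontal == "left_to_right" else range(width - 1, -1, -1)
    for r in rows:
        for c in cols:
            fb[dst_y + r][dst_x + c] = fb[src_y + r][src_x + c]

# Pan right by two pixels: source and destination share one column.
fb = [[1, 2, 3, 0, 0]]
blit(fb, src_x=0, src_y=0, dst_x=2, dst_y=0, width=3, height=1)
```

Copying this row left to right instead would overwrite the pixel at column 2 before it is read, which is the failure the text describes for panning.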

2.4.2.5 BITBLT Variations 

The ‘classical’ definition of BITBLT, as described in 
“Smalltalk-80: The Language and its Implementation”, by Adele 
Goldberg and David Robson, provides for three operands: 
source, destination and mask/texture. This third operand is 
commonly used in monochrome systems to incorporate a 
stipple pattern into an area. These stipple patterns provide 
the appearance of multiple shades of gray in single-bit-per- 
pixel systems, in a manner similar to the ‘halftone’ process 
used in printing. 

Texture op1 Source op2 Destination -> Destination 

While the NS32CG16 and the external BPU (if used) are 
essentially two-operand devices, three-operand BITBLT op- 
erations can be implemented quite flexibly and efficiently by 
performing the two operations serially. 
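
A minimal Python sketch of the serial approach, operating on single 16-bit words, is shown here. The function name and the choice of AND followed by OR are assumptions for illustration; any pair of logical operations could be substituted.

```python
def three_operand(texture, src, dst, op1, op2):
    """Texture op1 Source op2 Destination, done as two two-operand passes."""
    temp = op1(texture, src) & 0xFFFF   # pass 1: combine texture and source
    return op2(temp, dst) & 0xFFFF      # pass 2: combine result with destination

# Stipple the source with a 50% pattern, then OR it into the destination.
word = three_operand(0xAAAA, 0x0FF0, 0x8001,
                     op1=lambda a, b: a & b,
                     op2=lambda a, b: a | b)
```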

2.4.3 GRAPHICS SUPPORT INSTRUCTIONS 

The NS32CG16 provides eleven instructions for supporting 
graphics oriented applications. These instructions are divid- 
ed into three groups according to the operations they per- 
form. General descriptions for each of them and the related 
formats are provided in the following sections. 

2.4.3.1 BITBLT (BIT-aligned BLock Transfer) 

This group includes seven instructions. They are used to 
move characters and objects into the frame buffer which will 
be printed or displayed. One of the instructions works in 
conjunction with an external BITBLT Processing Unit (BPU) 
to maximize performance. The other six are executed by the 
NS32CG16. 

BIT-aligned BLock Transfer 
Syntax: BB(function) Options 

Setup:     R0      base address, source data 
           R1      base address, destination data 
           R2      shift value 
           R3      height (in lines) 
           R4      first mask 
           R5      second mask 
           R6      source warp (adjusted) 
           R7      destination warp (adjusted) 
           0(SP)   width (in words) 

Function:  AND, OR, XOR, FOR, STOD 

Options:   IA      Increasing Address (default option). 
                   When IA is selected, scan lines are 
                   transferred in the increasing BIT/BYTE 
                   order. 
           DA      Decreasing Address. 
           S       True Source (default option). 
           -S      Inverted Source. 

These five instructions perform standard BITBLT operations 
between source and destination blocks. The operations 
available include the following: 


BBAND:     src  AND  dst 
          -src  AND  dst 
BBOR:      src  OR   dst 
          -src  OR   dst 
BBXOR:     src  XOR  dst 
          -src  XOR  dst 
BBFOR:     src  OR   dst 
BBSTOD:    src  TO   dst 
          -src  TO   dst 

’src’ and ‘-src’ stand for ‘True Source’ and ‘Inverted 
Source’ respectively; ‘dst’ stands for ‘Destination’. 

Note 1: For speed reasons, the BB instructions require the masks to be 
specified with respect to the source block. In Figure 2-10 masking 
was defined relative to the destination block. 

Note 2: The options -S and DA are not available for the BBFOR instruc- 
tion. 

Note 3: BBFOR performs the same operation as BBOR with IA and S op- 
tions. 

Note 4: IA and DA are mutually exclusive and so are S and -S. 

Note 5: The width is defined as the number of words of source data to read. 

Note 6: An odd number of bytes can be specified for the source warp. 
However, word alignment of source scan lines will result in faster 
execution. 

The horizontal and vertical directions of the BITBLT opera- 
tions performed by the above instructions, with the excep- 
tion of BBFOR, are both programmable. The horizontal di- 
rection is controlled by the IA and DA options. The vertical 
direction is controlled by the sign of the source and destina- 
tion warps. Figure 2-12 and Table 2-3 show the format of 
the BB instructions and the encodings for the ‘op’ and ‘i’ 
fields. 


 23                  14 13    10 9   8 7               0 
| 0 0 0 0 0 0 D X S 0 |   op   |  i  | 0 0 0 0 1 1 1 0 | 


• D is set when the DA option is selected 

• S is set when the -S option is selected 

• X is set for BBAND, and it is clear for all other BB instructions 


FIGURE 2-12. BB Instructions Format 


TABLE 2-3. ‘op’ and ‘i’ Field Encodings 

Instruction   Options   ‘op’ Field   ‘i’ Field 
BBAND         Yes       1010         11 
BBOR          Yes       0110         01 
BBXOR         Yes       1110         01 
BBFOR         No        1100         01 
BBSTOD        Yes       0100         01 


BIT-aligned Word Transfer 
Syntax: BITWT 

Setup: R0 Base address, source word 

R1 Base address, destination double word 
R2 Shift value 

The BITWT instruction performs a fast logical OR operation 
between a source word and a destination double word, 
stores the result into the destination double word and incre- 
ments registers R0 and R1 by two. Before performing the 
OR operation, the source word is shifted left (i.e., in the 
direction of increasing bit numbers) by the value in register 
R2. 

This instruction can be used within the inner loop of a block 
OR operation. Its use assumes that the source data is 
‘clean’ and does not need masking. The BITWT format is 
shown in Figure 2-13. 
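
The effect of a single BITWT can be modeled in Python as follows. This is a sketch, not a description of the hardware: memory is a bytearray, little-endian byte order is assumed, and the function name and return convention are hypothetical.

```python
def bitwt(mem, r0, r1, r2):
    """OR the 16-bit word at r0, shifted left by r2, into the 32-bit double word at r1.

    Returns the updated (r0, r1), each advanced by two bytes.
    Little-endian byte order is assumed.
    """
    src = int.from_bytes(mem[r0:r0 + 2], "little")
    dst = int.from_bytes(mem[r1:r1 + 4], "little")
    dst |= (src << r2) & 0xFFFFFFFF
    mem[r1:r1 + 4] = dst.to_bytes(4, "little")
    return r0 + 2, r1 + 2

mem = bytearray(16)
mem[0:2] = (0x8001).to_bytes(2, "little")   # source word at address 0
r0, r1 = bitwt(mem, r0=0, r1=8, r2=3)        # destination double word at address 8
```

Because both pointers advance by two, calling the function repeatedly walks along a scan line one source word at a time.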


[Figure: 24-bit BITWT instruction encoding, bits 23 to 0]


FIGURE 2-13. BITWT Instruction Format 


External BITBLT 
Syntax: EXTBLT 

Setup: R0 base address, source data 

R1 base address, destination data 

R2 width (in bytes) 

R3 height (in lines) 

R4 horizontal increment/decrement 

R5 temporary register (current width) 

R6 source warp (adjusted) 

R7 destination warp (adjusted) 

Note 1: R0 and R1 are updated after execution to point to the last source 
and destination addresses plus related warps. R2, R3 and R5 will 
be modified. R4, R6, and R7 are returned unchanged. 

Note 2: Source and destination pointers should point to word-aligned 
operands to maximize speed and minimize external interface logic. 

This instruction performs an entire BITBLT operation in 
conjunction with an external BITBLT Processing Unit (BPU). 
The external BPU Control Register should be loaded by the 
software before the instruction is executed (refer to the 
DP8510 or DP8511 data sheets for more information on the 
BPU). The NS32CG16 generates a series of source read, 
destination read and destination write bus cycles until the 
entire data block has been transferred. The BITBLT 
operation can be performed in either horizontal direction, as 
controlled by the sign of the contents of register R4. 

Depending on the relative alignment of the source and 
destination blocks, an extra source read may be required at the 
beginning of each scan line, to load the pipeline register in 
the external BPU. The L bit in the PSR register determines 
whether the extra source read is performed. If L is 1, no 
extra read is performed. The instructions CMPQB 2,1 or 
CMPQB 1,2 could be executed to provide the right setting 
for the L bit just before executing EXTBLT. Figure 2-14 
shows the EXTBLT format. The bus activity for a simple 
BITBLT operation is shown in Figure 2-19. 


[Figure: 24-bit EXTBLT instruction encoding, bits 23 to 0]


FIGURE 2-14. EXTBLT Instruction Format 


2.4.3.2 Pattern Fill 

Only one instruction is in this group. It is usually used for 
clearing RAM and drawing patterns and lines. 

Move Multiple Pattern 
Syntax: MOVMPi 

Setup: R0 base address of the destination 

R1 pointer increment (in bytes) 

R2 number of pattern moves 

R3 source pattern 

Note: R1 and R3 are not modified by the instruction. R2 will always be 
returned as zero. R0 is modified to reflect the last address into which 
a pattern was written. 


This instruction stores the pattern in register R3 into the 
destination area whose address is in register R0. The 
pattern count is specified in register R2. After each store 
operation the destination address is changed by the contents of 
register R1. This allows the pattern to be stored in rows, in 
columns, and in any direction, depending on the value and 
sign of R1. The MOVMPi instruction format is shown in 
Figure 2-15. 
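
The register behavior described above can be sketched in Python. The model below is an assumption-laden illustration: memory is a bytearray, the operand length selected by the ‘i’ suffix is passed as a hypothetical size parameter, and little-endian byte order is assumed.

```python
def movmp(mem, r0, r1, r2, r3, size):
    """Store the low `size` bytes of pattern r3 into mem, r2 times.

    r0: destination address, r1: signed pointer increment in bytes,
    r2: number of pattern moves, size: operand length (1, 2 or 4).
    Returns (r0, r2) following the note: r0 is the last address
    written and r2 is zero.
    """
    pattern = (r3 & ((1 << (8 * size)) - 1)).to_bytes(size, "little")
    last = r0
    while r2 > 0:
        mem[r0:r0 + size] = pattern
        last = r0
        r0 += r1
        r2 -= 1
    return last, r2

# Draw a vertical line: one byte per scan line in a buffer 4 bytes wide.
mem = bytearray(16)
r0, r2 = movmp(mem, r0=1, r1=4, r2=4, r3=0x80, size=1)
```

Setting the increment equal to the operand size fills a row; setting it to the warp in bytes, as here, fills a column.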


[Figure: 24-bit MOVMPi instruction encoding, bits 23 to 0]


FIGURE 2-15. MOVMPi Instruction Format 


2.4.3.3 Data Compression, Expansion and Magnify 

The three instructions in this group can be used to com- 
press data and restore data from compression. A com- 
pressed character set may require from 30% to 50% less 
memory space for its storage. 

The compression ratio possible can be 50:1 or higher de- 
pending on the data and algorithm used. TBITS can also be 
used to find boundaries of an object. As a character is need- 
ed, the data is expanded and stored in a RAM buffer. The 
expand instructions (SBITS, SBITPS) can also function as 
line drawing instructions. 

Test Bit String 
Syntax: TBITS option 

Setup: R0 base address, source (byte address) 

R1 starting source bit offset 

R2 destination run length limited code 

R3 maximum value run length limit 

R4 maximum source bit offset 

Option: 1 count set bits until a clear bit is found 

0 count clear bits until a set bit is found 

Note: R0, R3 and R4 are not modified by the instruction execution. R1 
reflects the new bit offset. R2 holds the result. 

This instruction starts at the base address, adds a bit offset, 
and tests the bit for clear if “option” = 0 (and for set if 
“option” = 1). If clear (or set), the instruction increments to 
the next higher bit and tests for clear (or set). This testing 
for clear proceeds through memory until a set bit is found or 
until the maximum source bit offset or maximum run length 
value is reached. The total number of clear bits is stored in 
the destination as a run length value. 

When TBITS finds a set bit and terminates, the bit offset is 
adjusted to reflect the current bit address. Offset is then 
ready for the next TBITS instruction with “option” = 0. After 
the instruction is executed, the F flag is set to the value of 
the bit previous to the bit currently being pointed to (i.e., the 
value of the bit on which the instruction completed 
execution). In the case of a starting bit offset exceeding the 
maximum bit offset (R1 ≥ R4), the F flag is set if the option was 
1 and clear if the option was 0. The L flag is set when the 
desired bit is found, or if the run length equalled the 
maximum run length value and the bit was not found. It is cleared 
otherwise. Figure 2-16 shows the TBITS instruction format. 
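
The counting behavior can be approximated in Python as shown below. This is a simplified sketch: flag handling is omitted, the exact boundary conditions at the maximum offset and maximum run length are assumptions, and encode_runs is a hypothetical helper that alternates the option to run-length encode a bit string.

```python
def tbits(mem, r0, r1, r3, r4, option):
    """Count bits equal to `option` starting at bit offset r1 from byte address r0.

    Stops at the first differing bit, at the maximum run length r3,
    or at the maximum bit offset r4. Returns (new bit offset, run length).
    Flag handling is omitted.
    """
    run = 0
    while r1 < r4 and run < r3:
        bit = (mem[r0 + r1 // 8] >> (r1 % 8)) & 1
        if bit != option:
            break
        run += 1
        r1 += 1
    return r1, run

def encode_runs(mem, nbits, max_run=255):
    """Alternate option 0 and option 1 to produce run lengths for nbits bits."""
    offset, option, runs = 0, 0, []
    while offset < nbits:
        offset, run = tbits(mem, 0, offset, max_run, nbits, option)
        runs.append(run)
        option ^= 1
    return runs

data = bytes([0b11100000, 0b00000011])   # bits 0-4 clear, 5-9 set, 10-15 clear
runs = encode_runs(data, 16)
```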


[Figure: 24-bit TBITS instruction encoding, bits 23 to 0, including the S bit]


• S is set for ‘TBITS 1’ and clear for ‘TBITS 0’. 


FIGURE 2-16. TBITS Instruction Format 

Set Bit String 


Syntax: SBITS 
Setup: R0 base address of the destination 

R1 starting bit offset (signed) 

R2 number of bits to set (unsigned) 

R3 address of string look-up table 

Note: When the instruction terminates, the registers are returned 
unchanged. 

SBITS sets a number of contiguous bits in memory to 1, and 
is typically used for data expansion operations. The 
instruction draws the number of ones specified by the value in R2, 
starting at the bit address provided by registers R0 and R1. 
In order to maximize speed and allow drawing of patterned 
lines, an external 1 kbyte lookup table is used. The lookup 
table is specified in the NS32CG16 Printer/Display 
Processor Programmer’s Reference Supplement. 

When SBITS begins executing, it compares the value in R2 
with 25. If the value in R2 is less than or equal to 25, the F 
flag is cleared and the appropriate number of bits are set in 
memory. If R2 is greater than 25, the F flag is set and no 
other action is performed. This allows the software to use a 
faster algorithm to set longer strings of bits. Figure 2-17 
shows the SBITS instruction format. 
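
A Python sketch of this behavior follows. It models only the visible effect (bits set, F flag returned); the lookup table is not modeled, and the bit-in-byte numbering and function name are assumptions.

```python
def sbits(mem, r0, r1, r2):
    """Set r2 contiguous bits starting at bit offset r1 from byte address r0.

    Returns the F flag: 0 if the bits were set, 1 if r2 exceeded 25 and
    nothing was written. The lookup table used by the hardware is not
    modeled, and the target bytes are assumed to lie inside mem.
    """
    if r2 > 25:
        return 1
    for offset in range(r1, r1 + r2):
        mem[r0 + offset // 8] |= 1 << (offset % 8)
    return 0

mem = bytearray(4)
f_short = sbits(mem, r0=0, r1=6, r2=5)    # sets bits 6 through 10
f_long = sbits(mem, r0=0, r1=0, r2=26)    # too long: F set, memory untouched
```

A caller would test the returned flag and fall back to a different routine for long runs, as the text suggests.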


Set BIT Perpendicular String 
Syntax: SBITPS 

Setup: R0 base address, destination (byte address) 

R1 starting bit offset 

R2 number of bits to set 

R3 destination warp (signed value, in bits) 

Note: When the instruction terminates, the R0 and R3 registers are re- 
turned unchanged. R1 becomes the final bit offset. R2 is zero. 

The SBITPS can be used to set a string of bits in any direc- 
tion. This allows a font to be expanded with a 90 or 270 
degree rotation, as may be required in a printer application. 
SBITPS sets a string of bits starting at the bit address speci- 
fied in registers R0 and R1. The number of bits in the string 
is specified in R2. After the first bit is set, the destination 
warp is added to the bit address and the next bit is set. The 
process is repeated until all the bits have been set. A 
negative raster warp offset value leads to a 90 degree rotation. A 
positive raster warp value leads to a 270 degree rotation. If 
the R3 value equals the space warp +1 or -1, then the result is 
a 45 degree line. If the R3 value is +1 or -1, a horizontal 
line results. 

SBITS and SBITPS allow expansion on any 90 degree an- 
gle, giving portrait, landscape and mirror images from one 
font. Figure 2-18 shows the SBITPS instruction format. 
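
The stepping rule can be sketched in Python as below. The model returns the list of bit offsets it sets rather than updating registers, and the 32-pixel warp and function name are hypothetical choices for the example.

```python
def sbitps(mem, r0, r1, r2, r3):
    """Set r2 bits, stepping the bit offset by the signed warp r3 after each one.

    Returns the list of bit offsets that were set.
    """
    written = []
    for _ in range(r2):
        mem[r0 + r1 // 8] |= 1 << (r1 % 8)
        written.append(r1)
        r1 += r3
    return written

WARP = 32                                   # hypothetical 32-pixel-wide bitmap
mem = bytearray(WARP * 8 // 8)              # 8 scan lines
vertical = sbitps(mem, 0, 5, 4, WARP)       # column x = 5, rows 0 to 3
diagonal = sbitps(mem, 0, 10, 4, WARP + 1)  # 45 degree line from (10, 0)
```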


000000000011011100001110 


FIGURE 2-17. SBITS Instruction Format 


000000000010111100001110 


FIGURE 2-18. SBITPS Instruction Format 



[Figure: bus timing for four consecutive word transfers, WORD 1 through WORD 4, 12 clocks each]


FIGURE 2-19. Bus Activity for a Simple BITBLT Operation 
Note 1: This example is for a block 4 words wide and 1 line high. 

Note 2: The sequence is common with all logical operations of the DP8510/DP8511 BPU. 

Note 3: Mask values, shift values and number of bit planes do not affect the performance. 

Note 4: Zero wait states are assumed throughout the BITBLT operation. 

Note 5: The extra read is performed when the BPU pipeline register needs to be preloaded. 

2.4.3.3.1 Magnifying Compressed Data 

Restoring data is just one application of the SBITS and 
SBITPS instructions. Multiplying the “length” operand used 
by the SBITS and SBITPS instructions causes the resulting 
pattern to be wider, or a multiple of “length”. 

As the pattern of data is expanded, it can be magnified by 
2x, 3x, 4x, ..., 10x and so on. This creates several sizes of 
the same style of character, or changes the size of a logo. A 
magnify in both dimensions X and Y can be accomplished 
by drawing a single line, then using the MOVS (Move String) 
or the BB instructions to duplicate the line, maintaining an 
equal aspect ratio. 
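
The idea can be illustrated with a small Python model that works on run lengths directly. It is a sketch only: bits are held in Python lists rather than a frame buffer, line duplication stands in for the MOVS or BB step, and the function names are hypothetical.

```python
def expand_runs(runs, scale=1, first_bit=0):
    """Expand alternating run lengths into a list of bits, widening each run by scale."""
    bits, value = [], first_bit
    for length in runs:
        bits.extend([value] * (length * scale))
        value ^= 1
    return bits

def magnify(runs_per_line, scale):
    """Magnify in X by scaling run lengths and in Y by repeating each line."""
    image = []
    for runs in runs_per_line:
        line = expand_runs(runs, scale)
        image.extend([list(line) for _ in range(scale)])
    return image

glyph = [[1, 2, 1], [0, 4]]     # two scan lines, alternating clear/set runs
big = magnify(glyph, 2)
```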

More information on this subject is provided in the 
NS32CG16 Printer/Display Processor Programmer’s Refer- 
ence Supplement. 

3.0 Functional Description 

3.1 POWER AND GROUNDING 

The NS32CG16 requires a single 5-Volt power supply, 
applied on 5 pins. The logic voltage pin (VCCL) supplies the 
power to the on-chip logic. The buffer voltage pins 
VCCCTTL, VCCFCLK, VCCAD, and VCCIO supply the 
power to the on-chip output drivers. 

Grounding connections are made on 6 pins. The Logic 
Ground Pin (VSSL) provides the ground connection to the 
on-chip logic. The buffer ground pins VSSFCLK, VSSNTSO, 
VSSHAD, VSSLAD, VSSIO are the ground pins for the 
on-chip output drivers. 

For optimal noise immunity, the power and ground pins 
should be connected to VCC and ground planes respectively. 
If VCC and ground planes are not used, single conductors 
should be run directly from each VCC pin to a power point, 
and from each GND pin to a ground point. Daisy-chained 
connections should be avoided. 

Decoupling capacitors should also be used to keep the 
noise level to a minimum. Standard 0.1 µF ceramic 
capacitors can be used for this purpose. In addition, a 1.0 µF 
tantalum capacitor should be connected between VCCL and 
ground. They should attach to VCC, VSS pairs as close as 
possible to the NS32CG16. 

During prototyping using wire-wrap or similar methods, the 
capacitors should be soldered directly to the power pins of 
the NS32CG16 socket, or as close as possible, with very 
short leads. 

Recommended bypass for production in printed circuit 
boards: 

+5V        Ground     Capacitors 
VCCL       VSSL       0.1 µF Disk Ceramic, 1.0 µF Tantalum 
VCCIO      VSSIO      0.1 µF 
VCCCTTL    VSSNTSO    0.1 µF 
VCCAD      VSSLAD     0.1 µF 
VCCAD      VSSHAD     None 
VCCFCLK    VSSFCLK    0.1 µF 

VCCL-VSSL bypass requires a very short lead length and 
low inductance on the 0.1 µF capacitor. 

Design Notes 

When constructing a board using high frequency clocks with 
multiple lines switching, special care should be taken to 
avoid resonances on signal lines. A separate power and 
ground layer is recommended. This is true when designing 
boards for the NS32CG16. Switching times of under 5 ns on 
some lines are possible. Resonant frequencies should be 
maintained well above the 200 MHz frequency range on 
signal paths by keeping traces short and inductance low. 
Loading capacitance at the end of a transmission line con- 
tributes to the resonant frequency and should be minimized 
if possible. Capacitors should be located as close as possi- 
ble across each power and ground pair near the 
NS32CG16. 

Power and ground connections are shown in Figure 3-1. 

[Figure: +5V supply and ground connections to the NS32CG16, with the other VCC and ground connections made through the VCC and GND planes]


FIGURE 3-1. Power and Ground Connections 
3.2 CLOCKING 

The NS32CG16 provides an internal oscillator that interacts 
with an external clock source through two signals: OSCIN 
and OSCOUT. 

Either an external single-phase clock signal or a crystal can 
be used as the clock source. If a single-phase clock source 
is used, only the connection on OSCIN is required; 
OSCOUT should be left unconnected or loaded with no 
more than 5 pF of stray capacitance. The voltage level re- 
quirements specified in Section 4.3 must also be met for 
proper operation. 

When operation with a crystal is desired, a fundamental 
mode crystal should be used. In this case, special care 
should be taken to minimize stray capacitances and induc- 
tances, especially when operating at a crystal frequency of 
30 MHz. The crystal, as well as the external RC compo- 
nents, should be placed in close proximity to the OSCIN and 
OSCOUT pins to keep the printed circuit trace lengths to an 
absolute minimum. Figure 3-2 shows the external crystal 
interconnections. Table 3-1 provides the crystal characteris- 
tics and the values of the RC components, including stray 
capacitance, required for various frequencies. 

[Figure: crystal and external RC components connected between OSCIN and OSCOUT]


FIGURE 3-2. Crystal Interconnections 




TABLE 3-1. External Oscillator Specifications 

Crystal Characteristics 

Type                         AT-Cut 
Tolerance                    0.005% at 25°C 
Stability                    0.01% from 0°C to 70°C 
Resonance                    Fundamental (parallel) 
Capacitance                  20 pF 
Maximum Series Resistance    50 Ω 

R and C Values 

Frequency (MHz)   R1 (kΩ)   R2 (Ω)   C1 (pF)   C2 (pF) 
12                470       120      20        20 
16                360       100      20        20 
20                270       75       20        20 
25                220       68       20        20 
30                180       51       20        20 


3.2.1 Power Save Mode 

The NS32CG16 provides a power save feature that can be 
used to significantly reduce the power consumption at times 
when the computational demand decreases. The device 
uses the clock signal at the OSCIN pin to derive the internal 
clock as well as the external signals PHI1, PHI2, CTTL and 
FCLK. The frequency of all these clock signals is affected 
by the clock scaling factor. Scaling factors of 1, 2, 4 or 8 can 
be selected by properly setting the C and M bits in the CFG 
register. The power save mode should not be used to re- 
duce the clock frequency below the minimum frequency re- 
quired by the CPU. 

Upon reset, both C and M are set to zero, thus maximum 
clock rate is selected. 

Due to the fact that the C and M bits are programmed by the 
SETCFG instruction, the power save feature can only be 
controlled by programs running in supervisor mode. 

The following table shows the C and M bit settings for the 
various scaling factors, and the resulting supply current for a 
crystal frequency of 30 MHz. 

Clock Scaling Factor vs Supply Current 

C   M   Scaling Factor   CPU Clock Frequency   Typical ICC at +5V 
0   0   1                15 MHz                140 mA 
0   1   2                7.5 MHz               76 mA 
1   0   4                3.75 MHz              42 mA 
1   1   8                1.88 MHz              25 mA 
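
The relationship between the C and M bits and the resulting CPU clock in this table can be expressed compactly. The Python sketch below infers the formula from the four table rows (factor = 2 to the power 2C + M, CPU clock = half the crystal frequency divided by the factor); the function names are hypothetical.

```python
def scaling_factor(c, m):
    """Clock scaling factor selected by the C and M bits of the CFG register."""
    return 1 << (2 * c + m)

def cpu_clock_mhz(oscin_mhz, c, m):
    """CPU clock: half the OSCIN frequency, divided by the scaling factor."""
    return oscin_mhz / 2 / scaling_factor(c, m)

table = [(c, m, scaling_factor(c, m), cpu_clock_mhz(30, c, m))
         for c in (0, 1) for m in (0, 1)]
```

For C = 1 and M = 1 the computed value is 1.875 MHz, which the table rounds to 1.88 MHz.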


3.3 RESETTING 

The RSTI input pin is used to reset the NS32CG16. The 
CPU samples RSTI on the falling edge of CTTL. 

Whenever a low level is detected, the CPU responds 
immediately. Any instruction being executed is terminated; any 
results that have not yet been written to memory are 
discarded; and any pending interrupts and traps are eliminated. 
The internal latch for the edge-sensitive NMI signal is 
cleared. 

On application of power, RSTI must be held low for at least 
50 µs after VCC is stable. This is to ensure that all on-chip 
voltages are completely stable before operation. Whenever 
a Reset is applied, it must also remain active for not less 
than 64 CTTL cycles. See Figures 3-3 and 3-4. 

While in the Reset state, the CPU drives the signals ADS, 
RD, WR, DBE, TSO, BPU, and DDIN inactive. AD0-AD15, 
A16-A23 and SPC are floated, and the state of all other 
output signals is undefined. 

The internal CPU clock, PHI1, PHI2 and CTTL all run at half 
the frequency of the signal on the OSCIN pin. FCLK runs at 
the same frequency of OSCIN. 

The HOLD signal must be kept inactive. After the RSTI sig- 
nal is driven high, the CPU will stay in the reset condition for 
approximately 8 clock cycles and then it will begin execution 
at address 0. 

The PSR is reset to 0. The CFG C and M bits are reset to 0. 
NMI is enabled to allow Non-Maskable Interrupts. The fol- 
lowing conditions are present after reset due to the PSR 
being reset to 0: 

Tracing is disabled. 

Supervisor mode is enabled. 

Supervisor stack space is used when the TOS addressing 
mode is indicated. 

No trace traps are pending. 

Only NMI is enabled. INT is not enabled. 

BPU is inactive high. 

The Clock Scaling Factor is set to 1 (refer to Section 3.2.1). 

Note that vectored/non-vectored interrupts have not been 
selected. While interrupts are disabled, a SETCFG [I] 
instruction must be executed to declare the presence of the 
NS32202 if vectored interrupts are desired. If non-vectored 
interrupts are required, a SETCFG without the [I] must be 
executed. 

The presence/absence of the NS32081 or NS32381 has 
also not been declared. If there is a Floating Point Unit, a 
SETCFG [F] instruction must be executed. If there is no 
floating point unit, a SETCFG without the [F] must be exe- 
cuted. 



[Figure: optional external reset and optional reset switch driving RSTI, with RSTO providing the system reset]


FIGURE 3-2a. Recommended Reset Connections 

In general, a SETCFG instruction must be executed in the
reset routine, in order to properly configure the CPU. The
options should be combined, and executed in a single
instruction. For example, to declare vectored interrupts, a
Floating Point Unit installed, and full CPU clock rate, execute
a SETCFG [F, I] instruction. To declare non-vectored
interrupts, no FPU, and full CPU clock rate, execute a
SETCFG [ ] instruction.
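
The choice of SETCFG options described above can be sketched as a small helper. This is an illustrative Python sketch only, not part of any National Semiconductor tool; the function name `setcfg_operand` and its two boolean parameters are hypothetical, and the helper merely assembles the instruction text for the two options discussed here.

```python
def setcfg_operand(vectored_interrupts, fpu_present):
    """Build the SETCFG instruction text for the reset routine.

    Hypothetical helper: [I] declares vectored interrupts (an NS32202
    is present), [F] declares an installed Floating Point Unit. Both
    options are combined into a single instruction, as the text advises.
    """
    options = []
    if fpu_present:
        options.append("F")
    if vectored_interrupts:
        options.append("I")
    return "SETCFG [" + ", ".join(options) + "]"


# Vectored interrupts with an FPU, then the bare configuration.
print(setcfg_operand(True, True))
print(setcfg_operand(False, False))
```

Only the F and I options mentioned in this section are modeled; other configuration bits are deliberately left out.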



FIGURE 3-3. Power-On Reset Requirements 


(Timing diagram annotation: ≥ 64 clock cycles.)

FIGURE 3-4. General Reset Timing

3.4 BUS CYCLES

The CPU will perform a bus cycle for one of the following 
reasons: 

1) To write or read data, to or from memory or peripheral 
devices. Peripheral input and output are memory- 
mapped in the Series 32000 family. 

2) To fetch instructions into the eight-byte instruction 
queue. This happens whenever the bus would otherwise 
be idle and the queue is not already full. 

3) To acknowledge an interrupt and allow external circuitry 
to provide a vector number, or to acknowledge comple- 
tion of an interrupt service routine. 

4) To transfer information to or from a Slave Processor. 

In terms of bus timing, cases 1 through 3 above are identi- 
cal. For timing specifications, see Section 4. The only exter- 
nal difference between them is the four-bit code placed on 
the Bus Status pins (ST0-ST3). Slave Processor cycles dif- 
fer in that separate control signals are applied (Section 
3.4.7). 

3.4.1 Bus Status 

The NS32CG16 CPU presents four bits of Bus Status infor- 
mation on pins ST0-ST3. The various combinations on 
these pins indicate why the CPU is performing a bus cycle, 
or, if it is idle on the bus, then why it is idle. 

The Bus Status pins are interpreted as a four-bit value, with
ST0 the least significant bit. Their values decode as follows:

0000 — The bus is idle because the CPU does not need
to perform a bus access.

0001 — The bus is idle because the CPU is executing
the WAIT instruction.

0010 — (Reserved for future use.)

0011 — The bus is idle because the CPU is waiting for a
Slave Processor to complete an instruction.

0100 — Interrupt Acknowledge, Master. 

The CPU is performing a Read cycle to ac- 
knowledge an interrupt request. See Section 
3.4.6. 

0101 — Interrupt Acknowledge, Cascaded. 

The CPU is reading an interrupt vector to ac- 
knowledge a maskable interrupt request from a 
Cascaded Interrupt Control Unit. 

0110 — End of Interrupt, Master. 

The CPU is performing a Read cycle to indicate 
that it is executing a Return from Interrupt 
(RETI) instruction at the completion of an inter- 
rupt’s service procedure. 

0111 — End of Interrupt, Cascaded. 

The CPU is performing a read cycle from a Cas- 
caded Interrupt Control Unit to indicate that it is 
executing a Return from Interrupt (RETI) in- 
struction at the completion of an interrupt’s 
service procedure. 

1000 — Sequential Instruction Fetch. 

The CPU is reading the next sequential word 
from the instruction stream into the Instruction 
Queue. It will do so whenever the bus would 
otherwise be idle and the queue is not already 
full. 

1001 — Non-Sequential Instruction Fetch. 

The CPU is performing the first fetch of instruc- 
tion code after the Instruction Queue is purged. 
This will occur as a result of any jump or branch, 
any interrupt or trap, or execution of certain in- 
structions. 

1010 — Data Transfer. 

The CPU is reading or writing an operand of an 
instruction. 

1011 — Read RMW Operand. 

The CPU is reading an operand which will sub- 
sequently be modified and rewritten. The write 
cycle of RMW will have a “write” status. 

1100 — Read for Effective Address Calculation.

The CPU is reading information from memory in 
order to determine the Effective Address of an 
operand. This will occur whenever an instruc- 
tion uses the Memory Relative or External ad- 
dressing mode. 

1101 — Transfer Slave Processor Operand.

The CPU is either transferring an instruction op- 
erand to or from a Slave Processor, or it is issu- 
ing the Operation Word of a Slave Processor 
instruction. See Section 3.9.1. 

1110 — Read Slave Processor Status.

The CPU is reading a Status Word from a Slave 
Processor after the Slave Processor has sig- 
nalled completion of an instruction. 

1111 — Broadcast Slave ID. 

The CPU is initiating the execution of a Slave
Processor instruction by transferring the first
byte of the instruction, which represents the
slave processor identification.
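
For logic-analyzer or simulation work, the status list above can be captured as a lookup table. The following Python sketch is only an illustration; the names `BUS_STATUS` and `decode_bus_status` are hypothetical, and the descriptions are shortened from the list above. ST0 is treated as the least significant bit, as stated in the text.

```python
# Bus Status codes (ST3..ST0), transcribed from the list above.
BUS_STATUS = {
    0b0000: "Idle: no bus access needed",
    0b0001: "Idle: executing WAIT instruction",
    0b0010: "Reserved",
    0b0011: "Idle: waiting for Slave Processor",
    0b0100: "Interrupt Acknowledge, Master",
    0b0101: "Interrupt Acknowledge, Cascaded",
    0b0110: "End of Interrupt, Master",
    0b0111: "End of Interrupt, Cascaded",
    0b1000: "Sequential Instruction Fetch",
    0b1001: "Non-Sequential Instruction Fetch",
    0b1010: "Data Transfer",
    0b1011: "Read RMW Operand",
    0b1100: "Read for Effective Address Calculation",
    0b1101: "Transfer Slave Processor Operand",
    0b1110: "Read Slave Processor Status",
    0b1111: "Broadcast Slave ID",
}


def decode_bus_status(st3, st2, st1, st0):
    """Combine the four pin levels (0 or 1) into a code and look it up."""
    code = (st3 << 3) | (st2 << 2) | (st1 << 1) | st0
    return BUS_STATUS[code]


print(decode_bus_status(1, 0, 0, 1))
```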



3.4.2 Basic Read and Write Cycles 

The sequence of events occurring during a CPU access to 
either memory or peripheral device is shown in Figure 3-6 
for a read cycle, and Figure 3-7 for a write cycle. 

The cases shown assume that the selected memory or
peripheral device is capable of communicating with the CPU at
full speed. If not, then cycle extension may be requested
through CWAIT and/or WAIT1-2.

A full-speed bus cycle is performed in four cycles of the
CTTL clock signal, labeled T1 through T4. Clock cycles not
associated with a bus cycle are designated Ti (for "Idle").


During T1, the CPU applies an address on pins AD0-AD15
and A16-A23. It also provides a low-going pulse on the
ADS pin, which serves the dual purpose of informing
external circuitry that a bus cycle is starting and of providing
control to an external latch for demultiplexing Address bits
0-15 from the AD0-AD15 pins. See Figure 3-5. During this
time also the status signals DDIN, indicating the direction of
the transfer, and HBE, indicating whether the high byte
(AD8-AD15) is to be referenced, become valid.

During T2 the CPU switches the Data Bus, AD0-AD15, to 
either accept or present data. Note that the signals A16- 
A23 remain valid, and need not be latched. 
At this time the signals TSO (Timing State Output), DBE 
(Data Buffer Enable) and either RD (Read Strobe) or WR 
(Write Strobe) will also be activated. 

The T3 state provides for access time requirements, and it
occurs at least once in a bus cycle. At the end of T2, on the
rising edge of CTTL, the CWAIT and WAIT1-2 signals are
sampled to determine whether the bus cycle will be
extended. See Section 3.4.3.

If the CPU is performing a read cycle, the data bus
(AD0-AD15) is sampled at the beginning of T4 on the rising
edge of CTTL. Data must, however, be held a little longer to
meet the data hold time requirements. The RD signal is
guaranteed not to go inactive before this time, so its rising
edge can be safely used to disable the device providing the
input data.

The T4 state finishes the bus cycle. At the beginning of T4,
the RD or WR, and TSO signals go inactive, and on the
falling edge of CTTL, DBE goes inactive, having provided for
necessary data hold times. Data during Write cycles
remains valid from the CPU throughout T4. Note that the Bus
Status lines (ST0-ST3) change at the beginning of T4,
anticipating the following bus cycle (if any).

3.4.3 Cycle Extension 

To allow sufficient access time for any speed of memory or 
peripheral device, the NS32CG16 provides for extension of 
a bus cycle. Any type of bus cycle except a Slave Processor 
cycle can be extended. 

In Figures 3-6 and 3-7, note that during T3 all bus control 
signals from the CPU are flat. Therefore, a bus cycle can be 
cleanly extended by causing the T3 state to be repeated. 
This is the purpose of the WAIT1-2 and CWAIT input sig- 
nals. 

At the end of state T2, on the rising edge of CTTL, WAIT1- 
2 and CWAIT are sampled. 

If any of these signals are active, the bus cycle will be
extended by at least one clock cycle. Thus, one or more
additional T3 states (also called wait states) will be inserted
after the next T-State. Any combination of the above signals can
be activated at one time. However, the WAIT1-2 inputs are
only sampled by the CPU at the end of state T2. They are
ignored at all other times.

The WAIT1-2 inputs are binary weighted, and can be used 
to insert up to 3 wait states, according to the following table. 




WAIT2    WAIT1    Number of Wait States

HIGH     HIGH     0
HIGH     LOW      1
LOW      HIGH     2
LOW      LOW      3



CWAIT causes wait states to be inserted continuously as 
long as it is sampled active. It is normally used when the 
number of wait states to be inserted in the CPU bus cycle is 
not known in advance. 

The following sequence shows the CPU response to the
WAIT1-2 and CWAIT inputs.

1. Start bus cycle.

2. Sample WAIT1-2 and CWAIT at the end of state T2. 

3. If the WAIT1-2 inputs are both inactive, then go to step 
6. 


4. Insert the number of wait states selected by WAIT1-2. 

5. Sample CWAIT again. 

6. If CWAIT is not active, then go to step 8. 

7. Insert one wait state and then go to step 5. 

8. Complete bus cycle. 

Figure 3-8 shows a bus cycle extended by three wait states,
two of which are due to WAIT2 and one of which is due to CWAIT.
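
The eight-step sequence above can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not a timing-accurate model: the function name `count_wait_states` is hypothetical, the inputs are given as "active" booleans (the table shows that a LOW level is the active level for WAIT1-2), and CWAIT is supplied as the list of values the CPU would see each time it samples the pin.

```python
def count_wait_states(wait1_active, wait2_active, cwait_samples):
    """Count the wait states (extra T3 states) inserted in one bus cycle.

    cwait_samples: CWAIT as seen at each sampling point, in order. The
    first entry is the sample taken at the end of T2 (step 2); a missing
    sample is treated as inactive.
    """
    samples = iter(cwait_samples)
    cwait = next(samples, False)            # step 2: sample at end of T2
    count = 0
    if wait1_active or wait2_active:        # step 3: not both inactive
        # Step 4: WAIT1-2 are binary weighted (WAIT2 = 2, WAIT1 = 1).
        count += (2 if wait2_active else 0) + (1 if wait1_active else 0)
        cwait = next(samples, False)        # step 5: sample CWAIT again
    while cwait:                            # step 6: loop while active
        count += 1                          # step 7: insert one wait state
        cwait = next(samples, False)        # back to step 5
    return count                            # step 8: complete bus cycle


# The Figure 3-8 case: two wait states from WAIT2, then one from CWAIT.
print(count_wait_states(False, True, [False, True, False]))
```

Passing CWAIT as a pre-recorded list keeps the sketch free of any clocking model; a real simulation would sample the signal on each rising edge of CTTL instead.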


3.4.4 Data Access Sequences 

The 24-bit address provided by the NS32CG16 is a byte
address; that is, it uniquely identifies one of up to
16,777,216 eight-bit memory locations. An important feature
of the NS32CG16 is that the presence of a 16-bit data bus
imposes no restrictions on data alignment; any data item,
regardless of size, may be placed starting at any memory
address. The NS32CG16 provides a special control signal,
High Byte Enable (HBE), which facilitates individual byte
addressing on a 16-bit bus.

Memory is organized as two eight-bit banks, each bank
receiving the word address (A1-A23) in parallel. One bank,
connected to Data Bus pins AD0-AD7, is enabled to
respond to even byte addresses; i.e., when the least
significant address bit (A0) is low. The other bank, connected to
Data Bus pins AD8-AD15, is enabled when HBE is low. See
Figure 3-9.



FIGURE 3-9. Memory Interface 

Any bus cycle falls into one of three categories: Even Byte
Access, Odd Byte Access, and Even Word Access. All
accesses to any data type are made up of sequences of these
cycles. Table 3-2 gives the state of A0 and HBE for each
category.


TABLE 3-2. Bus Cycle Categories 


Category     HBE    A0

Even Byte    1      0
Odd Byte     0      1
Even Word    0      0
Accesses of operands requiring more than one bus cycle
are performed sequentially, with no idle T-States separating
them. The number of bus cycles required to transfer an
operand depends on its size and its alignment (i.e., whether it
starts on an even byte address or an odd byte address).
Table 3-3 lists the bus cycles performed for each situation.
For the timing of A0 and HBE, see Section 3.4.2.
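
The way an operand breaks down into Even Byte, Odd Byte and Even Word cycles can be sketched as follows. This Python fragment is an illustration derived from Tables 3-2 and 3-3, not a description of the CPU's internal logic; the function name `access_sequence` is hypothetical, and the sketch assumes that an eight-byte operand is handled as two four-byte halves, which is how the Quad-Word rows of Table 3-3 read.

```python
def access_sequence(address, size):
    """List the bus cycles for an operand of `size` bytes at `address`.

    Returns tuples of (category, address, HBE, A0), using the HBE and A0
    levels from Table 3-2. Sizes of 1, 2, 4 and 8 bytes are accepted.
    """
    if size not in (1, 2, 4, 8):
        raise ValueError("size must be 1, 2, 4 or 8 bytes")
    cycles = []
    # Assumption: quad-words are transferred as two double-word halves.
    for base in range(address, address + size, 4):
        end = min(base + 4, address + size)
        addr = base
        while addr < end:
            if addr % 2 == 1:
                cycles.append(("Odd Byte", addr, 0, 1))
                addr += 1
            elif end - addr >= 2:
                cycles.append(("Even Word", addr, 0, 0))
                addr += 2
            else:
                cycles.append(("Even Byte", addr, 1, 0))
                addr += 1
    return cycles


# An odd-aligned double-word: Odd Byte, Even Word, Even Byte.
for cycle in access_sequence(0x1001, 4):
    print(cycle)
```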

3.4.4.1 Bit Accesses 

The Bit Instructions perform byte accesses to the byte con- 
taining the designated bit. The Test and Set Bit instruction 
(SBIT), for example, reads a byte, alters it, and rewrites it, 
having changed the contents of one bit. 

3.4.4.2 Bit Field Accesses 

An access to a Bit Field in memory always generates a Dou- 
ble-Word transfer at the address containing the least signifi- 
cant bit of the field. The Double Word is read by an Extract 
instruction; an Insert instruction reads a Double Word, modi- 
fies it, and rewrites it. 

3.4.4.3 Extending Multiply Accesses 

The Multiply Extended Integer (MEI) instruction will return a 
result which is twice the size in bytes of the operand it 
reads. If the multiplicand is in memory, the most-significant 
half of the result is written first (at the higher address), then 
the least-significant half. 

3.4.5 Instruction Fetches 

Instructions for the NS32CG16 CPU are "prefetched"; that
is, they are input before being needed into the next available
entry of the eight-byte Instruction Queue. The CPU performs
two types of Instruction Fetch cycles: Sequential and
Non-Sequential. These can be distinguished from each other by
their differing status combinations on pins ST0-ST3
(Section 3.4.1).

A Sequential Fetch will be performed by the CPU whenever 
the Data Bus would otherwise be idle and the Instruction 
Queue is not currently full. Sequential Fetches are always 
Even Word Read cycles (Table 3-2). 

A Non-Sequential Fetch occurs as a result of any break in 
the normally sequential flow of a program. Any jump or 
branch instruction, a trap or an interrupt will cause the next 
Instruction Fetch cycle to be Non-Sequential. In addition, 
certain instructions flush the instruction queue, causing the 
next instruction fetch to display Non-Sequential status. Only 
the first bus cycle after a break displays Non-Sequential 
status, and that cycle is either an Even Word Read or an 
Odd Byte Read, depending on whether the destination ad- 
dress is even or odd. 

3.4.6 Interrupt Control Cycles 

Activating the INT or NMI pin on the CPU will initiate one or
more bus cycles whose purpose is interrupt control rather 
than the transfer of instructions or data. Execution of the 
Return from Interrupt instruction (RETI) will also cause Inter- 
rupt Control bus cycles. These differ from instruction or data 
transfers only in the status presented on pins ST0-ST3. All 
Interrupt Control cycles are single-byte Read cycles. 

Table 3-4 shows the Interrupt Control sequences associat- 
ed with each interrupt and with the return from its service 
routine. For full details of the NS32CG16 interrupt structure, 
see Section 3.8. 

TABLE 3-3. Access Sequences

Cycle  Type       Address  HBE  A0  High Bus     Low Bus

A. Odd Word Access Sequence

1      Odd Byte   A        0    1   Byte 0       Don’t Care
2      Even Byte  A+1      1    0   Don’t Care   Byte 1

B. Even Double-Word Access Sequence

1      Even Word  A        0    0   Byte 1       Byte 0
2      Even Word  A+2      0    0   Byte 3       Byte 2

C. Odd Double-Word Access Sequence

1      Odd Byte   A        0    1   Byte 0       Don’t Care
2      Even Word  A+1      0    0   Byte 2       Byte 1
3      Even Byte  A+3      1    0   Don’t Care   Byte 3

D. Even Quad-Word Access Sequence

1      Even Word  A        0    0   Byte 1       Byte 0
2      Even Word  A+2      0    0   Byte 3       Byte 2

Other bus cycles (instruction prefetch or slave) can occur here.

3      Even Word  A+4      0    0   Byte 5       Byte 4
4      Even Word  A+6      0    0   Byte 7       Byte 6

E. Odd Quad-Word Access Sequence

1      Odd Byte   A        0    1   Byte 0       Don’t Care
2      Even Word  A+1      0    0   Byte 2       Byte 1
3      Even Byte  A+3      1    0   Don’t Care   Byte 3

Other bus cycles (instruction prefetch or slave) can occur here.

4      Odd Byte   A+4      0    1   Byte 4       Don’t Care
5      Even Word  A+5      0    0   Byte 6       Byte 5
6      Even Byte  A+7      1    0   Don’t Care   Byte 7
TABLE 3-4. Interrupt Sequences 

Cycle  Status  Address    DDIN  HBE      A0       High Bus     Low Bus

A. Non-Maskable Interrupt Control Sequence

Interrupt Acknowledge

1      0100    FFFF00₁₆   0     1        0        Don’t Care   Don’t Care

Interrupt Return

None: Performed through Return from Trap (RETT) instruction.

B. Non-Vectored Interrupt Control Sequence

Interrupt Acknowledge

1      0100    FFFE00₁₆   0     1        0        Don’t Care   Don’t Care

Interrupt Return

None: Performed through Return from Trap (RETT) instruction.

C. Vectored Interrupt Sequence: Non-Cascaded

Interrupt Acknowledge

1      0100    FFFE00₁₆   0     1        0        Don’t Care   Vector:
                                                               Range: 0-127

Interrupt Return

1      0110    FFFE00₁₆   0     1        0        Don’t Care   Vector: Same as
                                                               in Previous Int.
                                                               Ack. Cycle

D. Vectored Interrupt Sequence: Cascaded

Interrupt Acknowledge

1      0100    FFFE00₁₆   0     1        0        Don’t Care   Cascade Index:
                                                               range -16 to -1

(The CPU here uses the Cascade Index to find the Cascade Address.)

2      0101    Cascade    0     1 or 0*  0 or 1*  Vector, range 0-255; on appropriate
               Address                            half of Data Bus for even/odd address

Interrupt Return

1      0110    FFFE00₁₆   0     1        0        Don’t Care   Cascade Index:
                                                               same as in
                                                               previous Int.
                                                               Ack. Cycle

(The CPU here uses the Cascade Index to find the Cascade Address.)

2      0111    Cascade    0     1 or 0*  0 or 1*  Don’t Care   Don’t Care
               Address

* If the Cascaded ICU Address is Even (A0 is low), then the CPU applies HBE high and reads the vector number from bits 0-7 of the Data Bus.
If the address is Odd (A0 is high), then the CPU applies HBE low and reads the vector number from bits 8-15 of the Data Bus. The vector number
may be in the range 0-255.
3.4.7 Slave Processor Communication 

The SPC pin is used as the data strobe for Slave Processor
transfers. In a Slave Processor bus cycle, data is transferred
on the Data Bus (AD0-AD15), and the status lines ST0-ST3
are monitored by the Slave Processor in order to
determine the type of transfer being performed. SPC is
bidirectional, but is driven by the CPU during all Slave Processor
bus cycles. See Section 3.8 for full protocol sequences.



FIGURE 3-10. Slave Processor Connections 







*Note: CPU samples Data Bus here.

FIGURE 3-11. Slave Processor Read Cycle 
3.4.7.1 Slave Processor Bus Cycles

A Slave Processor bus cycle always takes exactly two clock
cycles, labeled T1 and T4 (see Figures 3-11 and 3-12).
During a Read cycle SPC is active from the beginning of T1
to the beginning of T4, and the data is sampled at the end of
T1. The Cycle Status pins lead the cycle by one clock
period, and are sampled at the leading edge of SPC. During a
Write cycle, the CPU applies data and activates SPC at T1,
removing SPC at T4. The Slave Processor latches status on
the leading edge of SPC and latches data on the trailing
edge.

The CPU does not pulse the Address Strobe (ADS), and no 
bus signals are generated. The direction of a transfer is de- 
termined by the sequence (“protocol”) established by the 
instruction under execution; but the CPU indicates the direc- 
tion on the DDIN pin for hardware debugging purposes. 


3.4.7.2 Slave Operand Transfer Sequences 

A Slave Processor operand is transferred in one or more 
Slave bus cycles. A Byte operand is transferred on the 
least-significant byte of the Data Bus (AD0-AD7), and a 
Word operand is transferred on the entire bus. A Double 
Word is transferred in a consecutive pair of bus cycles, 
least-significant word first. A Quad Word is transferred in 
two pairs of Slave cycles, with other bus cycles possibly 
occurring between them. The word order is from least-signif- 
icant word to most-significant. 
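
The word ordering described above can be illustrated with a short Python sketch. The function name `slave_transfer_words` is hypothetical, and the sketch only shows how an operand value divides into the 16-bit quantities placed on the Data Bus; it does not model SPC timing or the bus cycles that may intervene between the two halves of a Quad Word.

```python
def slave_transfer_words(value, size_bytes):
    """Split an operand into the Data Bus values of its Slave bus cycles.

    A Byte is carried on AD0-AD7 in one cycle; larger operands are sent
    as 16-bit words, least-significant word first.
    """
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError("size must be 1, 2, 4 or 8 bytes")
    if size_bytes == 1:
        return [value & 0xFF]
    word_count = size_bytes // 2
    return [(value >> (16 * i)) & 0xFFFF for i in range(word_count)]


# A Double Word is sent as two words, least-significant first.
print([hex(w) for w in slave_transfer_words(0x11223344, 4)])
```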

*Note: Slave Processor samples Data Bus here.

FIGURE 3-12. Slave Processor Write Cycle

3.5 BUS ACCESS CONTROL

The NS32CG16 CPU has the capability of relinquishing its
access to the bus upon request from a DMA controller or
another CPU. This capability is implemented on the HOLD
(Hold Request) and HLDA (Hold Acknowledge) pins. By
asserting HOLD low, an external device requests access to
the bus. On receipt of HLDA from the CPU, the device may
perform bus cycles, as the CPU at this point has set
AD0-AD15, A16-A23 and HBE to the TRI-STATE® condition and
has switched ADS and DDIN to the input mode. The CPU
now monitors ADS and DDIN from the external device to
generate the relevant strobe signals (i.e., TSO, DBE, RD or
WR). To return control of the bus to the CPU, the device
sets HOLD inactive, and the CPU acknowledges return of
the bus by setting HLDA inactive.

How quickly the CPU releases the bus depends on whether
it is idle on the bus at the time the HOLD request is made,
as the CPU must always complete the current bus cycle.
Figure 3-13 shows the timing sequence when the CPU is
idle. In this case, the CPU grants the bus during the
immediately following clock cycle. Figure 3-14 shows the sequence
if the CPU is using the bus at the time that the HOLD
request is made. If the request is made during or before the
clock cycle shown (two clock cycles before T4), the CPU
will release the bus during the clock cycle following T4. If
the request occurs closer to T4, the CPU may already have
decided to initiate another bus cycle. In that case it will not
grant the bus until after the next T4 state. Note that this
situation will also occur if the CPU is idle on the bus but has
initiated a bus cycle internally.

Note 1: During DMA cycles the WAIT1-2 signals should be kept inactive,
unless they are also monitored by the DMA controller. If wait states
are required, CWAIT should be used.

Note 2: The logic value of the status pins, ST0-ST3, is undefined during 
DMA activity. 




FIGURE 3-13. HOLD Timing, Bus Initially Idle 

(Timing diagram. Signals shown: CTTL, HOLD, HLDA, ADS, DDIN,
HBE, AD0-AD15, A16-A23, ST0-ST3.)


3.6 INSTRUCTION STATUS 

In addition to the four bits of Bus Cycle status (ST0-ST3),
the NS32CG16 CPU also presents Instruction Status
information on three separate pins. These pins differ from
ST0-ST3 in that they are synchronous to the CPU’s internal
instruction execution section rather than to its bus interface
section.


PFS (Program Flow Status) is pulsed low as each instruction
begins execution. It is intended for debugging purposes.

U/S originates from the U bit of the Processor Status
Register, and indicates whether the CPU is currently running in
User or Supervisor mode. Although it is not synchronous to
bus cycles, there are guarantees on its validity during any
given bus cycle. See the Timing Specifications in Section 4.
ILO (Interlocked Operation) is activated during an SBITI (Set 
Bit, Interlocked) or CBITI (Clear Bit, Interlocked) instruction. 
It is made available to external bus arbitration circuitry in 
order to allow these instructions to implement the sema- 
phore primitive operations for multi-processor communica- 
tion and resource sharing. ILO is guaranteed to be active 
during the operand accesses performed by the interlocked 
instructions. 

Note: The acknowledge of HOLD is on a cycle by cycle basis. Therefore, it 
is possible to have HLDA active when an interlocked operation is in 
progress. In this case, ILO remains low and the interlocked instruction 
continues only after HOLD is de-asserted. 

3.7 EXCEPTION PROCESSING 

Exceptions are special events that alter the sequence of 
instruction execution. The CPU recognizes two basic types 
of exceptions: interrupts and traps. 

An interrupt occurs in response to an event signalled by
activating the NMI or INT input signals. Interrupts are
typically requested by peripheral devices that require the CPU’s
attention.

Traps occur as a result either of exceptional conditions 
(e.g., attempted division by zero) or of specific instructions 
whose purpose is to cause a trap to occur (e.g., supervisor 
call instruction). 

When an exception is recognized, the CPU saves the PC, 
PSR and the MOD register contents on the interrupt stack 
and then it transfers control to an exception service proce- 
dure. 

Details on the operations performed in the various cases by 
the CPU to enter and exit the exception service procedure 
are given in the following sections. 


Note that the reset operation is not treated here as an
exception, even though, like any exception, it alters the
instruction execution sequence. The reason is that the CPU
handles reset in a significantly different way than it handles
exceptions.

Refer to Section for details on the reset operation.

3.7.1 Exception Acknowledge Sequence 

When an exception is recognized, the CPU goes through
three major steps:

1) Adjustment of Registers. 

Depending on the source of the exception, the CPU may 
restore and/or adjust the contents of the Program Coun- 
ter (PC), the Processor Status Register (PSR) and the 
currently-selected Stack Pointer (SP). A copy of the PSR 
is made, and the PSR is then set to reflect Supervisor 
Mode and selection of the Interrupt Stack. 

2) Vector Acquisition. 

A Vector is either obtained from the Data Bus or is sup- 
plied by default. 

3) Service Call. 

The Vector is used as an index into the Interrupt Dis- 
patch Table, whose base address is taken from the CPU 
Interrupt Base (INTBASE) Register. See Figure 3-15. A 
32-bit External Procedure Descriptor is read from the ta- 
ble entry, and an External Procedure Call is performed 
using it. The MOD Register (16 bits) and Program Counter
(32 bits) are pushed on the Interrupt Stack.
3.7.2 Returning from an Exception Service Procedure

To return control to an interrupted program, one of two
instructions can be used: RETT (Return from Trap) and RETI
(Return from Interrupt).

RETT is used to return from any trap or a non-maskable 
interrupt service procedure. Since some traps are often 
used deliberately as a call mechanism for supervisor mode 
procedures, RETT can also adjust the Stack Pointer (SP) to 
discard a specified number of bytes from the original stack 
as surplus parameter space. 

RETI is used to return from a maskable interrupt service
procedure. Unlike RETT, RETI also informs any
external interrupt control units that interrupt service has
completed. Since interrupts are generally asynchronous
external events, RETI does not discard parameters from the
stack.

Both of the above instructions always restore the PSR, 
MOD, PC and SB registers to their previous contents. 


3.7.3 Maskable Interrupts 

The INT pin is a level-sensitive input. A continuous low level 
is allowed for generating multiple interrupt requests. The in- 
put is maskable, and is therefore enabled to generate inter- 
rupt requests only while the Processor Status Register I bit 
is set. The I bit is automatically cleared during service of an 
INT or NMI request, and is restored to its original setting 
upon return from the interrupt service routine via the RETT 
or RETI instruction. 

The INT pin may be configured via the SETCFG instruction
as either Non-Vectored (CFG Register bit I = 0) or Vectored
(bit I = 1).

3.7.3.1 Non-Vectored Mode 

In the Non-Vectored mode, an interrupt request on the INT
pin will cause an Interrupt Acknowledge bus cycle, but the
CPU will ignore any value read from the bus and use instead
a default vector of zero. This mode is useful for small
systems in which hardware interrupt prioritization is
unnecessary.



FIGURE 3-17. Return from Trap (RETT n) Instruction Flow 

FIGURE 3-18. Return from Interrupt (RETI) Instruction Flow 


3.7.3.2 Vectored Mode: Non-Cascaded Case 

In the Vectored mode, the CPU uses an Interrupt Control
Unit (ICU) to prioritize up to 16 interrupt requests. Upon
receipt of an interrupt request on the INT pin, the CPU
performs an "Interrupt Acknowledge, Master" bus cycle,
reading a vector value from the low-order byte of the Data Bus.
This vector is then used as an index into the Dispatch Table
in order to find the External Procedure Descriptor for the
proper interrupt service procedure. The service procedure
eventually returns via the Return from Interrupt (RETI)
instruction, which performs an End of Interrupt bus cycle,
informing the ICU that it may re-prioritize any interrupt
requests still pending. The ICU provides the vector number
again, which the CPU uses to determine whether it needs
also to inform a Cascaded ICU.

In a system with only one ICU (16 levels of interrupt), the
vectors provided must be in the range of 0 through 127; that
is, they must be positive numbers in eight bits. By providing
a negative vector number, an ICU flags the interrupt source
as being a Cascaded ICU (see below).

3.7.3.3 Vectored Mode: Cascaded Case 

In order to allow up to 256 levels of interrupt, provision is
made both in the CPU and in the NS32202 Interrupt Control
Unit (ICU) to transparently support cascading. Figure 3-20
shows a typical cascaded configuration. Note that the
Interrupt output from a Cascaded ICU goes to an Interrupt
Request input of the Master ICU, which is the only ICU which
drives the CPU INT pin.

In a system which uses cascading, two tasks must be per- 
formed upon initialization: 

1) For each Cascaded ICU in the system, the Master ICU
must be informed of the line number (0 to 15) on which it
receives the cascaded requests.

2) A Cascade Table must be established in memory. The 
Cascade Table is located in a NEGATIVE direction from 
the location indicated by the CPU Interrupt Base (INT- 
BASE) Register. Its entries are 32-bit addresses, pointing 
to the Vector Registers of each of up to 16 Cascaded 
ICUs. 

Figure 3-15 illustrates the position of the Cascade Table. To 
find the Cascade Table entry for a Cascaded ICU, take its 
Master ICU line number (0 to 15) and subtract 16 from it, 
giving an index in the range -16 to -1. Multiply this value 
by 4, and add the resulting negative number to the contents 
of the INTBASE Register. The 32-bit entry at this address 
must be set to the address of the Hardware Vector Register 
of the Cascaded ICU. This is referred to as the “Cascade 
Address.” 
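
The address arithmetic in the paragraph above can be written out directly. The following Python sketch is illustrative only; the function name `cascade_entry_address` and the example INTBASE value are hypothetical.

```python
def cascade_entry_address(intbase, master_line):
    """Return the address of the Cascade Table entry for a Cascaded ICU.

    intbase: contents of the INTBASE register.
    master_line: Master ICU line number (0 to 15) of the Cascaded ICU.
    """
    if not 0 <= master_line <= 15:
        raise ValueError("Master ICU line number must be 0 to 15")
    index = master_line - 16          # index in the range -16 to -1
    return intbase + 4 * index        # entries are 32-bit (4-byte) addresses


# With a hypothetical INTBASE of 0x2000, line 15 maps to the entry
# immediately below INTBASE and line 0 to the entry 64 bytes below it.
print(hex(cascade_entry_address(0x2000, 15)))
print(hex(cascade_entry_address(0x2000, 0)))
```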

Upon receipt of an interrupt request from a Cascaded ICU,
the Master ICU interrupts the CPU and provides the
negative Cascade Table index instead of a (positive) vector
number. The CPU, seeing the negative value, uses it as an
index into the Cascade Table and reads the Cascade
Address from the referenced entry. Applying this address, the
CPU performs an "Interrupt Acknowledge, Cascaded" bus
cycle, reading the final vector value. This vector is
interpreted by the CPU as an unsigned byte, and can therefore be in
the range of 0 through 255.
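
The distinction between a direct vector and a Cascade Table index comes down to reading the Master ICU's byte as a signed value. A minimal Python sketch of that interpretation follows; the function name `classify_master_vector` is hypothetical, and the sketch does not check that a negative value falls inside the documented range of -16 to -1.

```python
def classify_master_vector(byte_read):
    """Interpret the byte read in an Interrupt Acknowledge, Master cycle.

    byte_read: the raw value (0..255) from the low-order byte of the bus.
    Returns ("vector", n) for a non-negative vector (0 to 127), or
    ("cascade_index", n) for a negative Cascade Table index.
    """
    if not 0 <= byte_read <= 0xFF:
        raise ValueError("expected an eight-bit value")
    # Reinterpret the eight-bit value as a two's-complement number.
    signed = byte_read - 0x100 if byte_read & 0x80 else byte_read
    if signed >= 0:
        return ("vector", signed)
    return ("cascade_index", signed)


print(classify_master_vector(0x05))
print(classify_master_vector(0xFF))
```

The byte subsequently read from the Cascaded ICU needs no such conversion, since the text states that it is taken as an unsigned value in the range 0 through 255.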

In returning from a Cascaded interrupt, the service
procedure executes the Return from Interrupt (RETI) instruction,
as it would for any Maskable Interrupt. The CPU performs
an "End of Interrupt, Master" bus cycle, whereupon the
Master ICU again provides the negative Cascade Table
index. The CPU, seeing a negative value, uses it to find the
corresponding Cascade Address from the Cascade Table.
Applying this address, it performs an "End of Interrupt,
Cascaded" bus cycle, informing the Cascaded ICU of the
completion of the service routine. The byte read from the
Cascaded ICU is discarded.

Note: If an interrupt must be masked off, the CPU can do so by setting the
corresponding bit in the Interrupt Mask Register of the Interrupt
Controller. However, if an interrupt is set pending during the CPU
instruction that masks off that interrupt, the CPU may still perform an
interrupt acknowledge cycle following that instruction since it might have
sampled the INT line before the ICU deasserted it. This could cause
the ICU to provide an invalid vector. To avoid this problem the above
operation should be performed with CPU interrupts disabled.

3.7.4 Non-Maskable Interrupt

The Non-Maskable Interrupt is triggered whenever a falling
edge is detected on the NMI pin. The CPU performs an
"Interrupt Acknowledge, Master" bus cycle when
processing of this interrupt actually begins. The Interrupt
Acknowledge cycle differs from that provided for Maskable
Interrupts in that the address presented is FFFF00₁₆. The vector
value used for the Non-Maskable Interrupt is taken as 1,
regardless of the value read from the bus.

The service procedure returns from the Non-Maskable In- 
terrupt using the Return from Trap (RETT) instruction. No 
special bus cycles occur on return. 

For the full sequence of events in processing the Non-
Maskable Interrupt, see Section 3.7.8.1.


3.7.5 Traps 

Traps are processing exceptions that are generated as direct
results of the execution of an instruction. The Return
Address pushed by any trap except Trap (TRC) is the address
of the first byte of the instruction during which the trap
occurred. Traps do not disable interrupts, as they are not
associated with external events. Traps recognized by the
NS32CG16 CPU are:

Trap (SLAVE): An exceptional condition was detected by 
the Floating Point Unit during the execution of a Slave In- 
struction. This trap is requested via the Status Word re- 
turned as part of the Slave Processor Protocol (Section 
3.8.1).

Trap (ILL): Illegal operation. A privileged operation was
attempted while the CPU was in User Mode (PSR bit U = 1).

Trap (SVC): The Supervisor Call (SVC) instruction was
executed.

Trap (DVZ): An attempt was made to divide an integer by 
zero. (The SLAVE trap is used for Floating Point division by 
zero.) 

Trap (FLG): The FLAG instruction detected a “1” in the 
CPU PSR F bit. 

Trap (BPT): The Breakpoint (BPT) instruction was execut- 
ed. 

Trap (TRC): The instruction just completed is being traced. 
See Section 3.7.6. 

Trap (UND): An undefined opcode was encountered by the 
CPU. 

3.7.6 Instruction Tracing 

Instruction tracing is a feature that can be used during de- 
bugging to single-step through selected portions of a pro- 
gram. Tracing is enabled by setting the T-bit in the PSR 
Register. When enabled, the CPU generates a Trace Trap 
(TRC) after the execution of each instruction. 

At the beginning of each instruction, the T bit is copied into 
the PSR P (Trace “Pending") bit. If the P bit is set at the end 
of an instruction, then the Trace Trap is activated. If any 
other trap or interrupt request is made during a traced in- 
struction, its entire service procedure is allowed to complete 
before the Trace Trap occurs. Each interrupt and trap se- 
quence handles the P bit for proper tracing, guaranteeing 
only one Trace Trap per instruction, and guaranteeing that 
the Return Address pushed during a Trace Trap is always 
the address of the next instruction to be traced. 

Because some instructions can clear the T and P bits in the
PSR, in some cases a Trace Trap may not occur at the end of
the instruction. This happens when one of the privileged
instructions BICPSRW or LPRW PSR is executed. In other cases,
it is still possible to guarantee that a Trace Trap occurs at
the end of the instruction, provided that special care is
taken before returning from the Trace Trap Service Procedure.
If a BICPSRB instruction has been executed, the service
procedure should make sure that the T bit in the PSR copy
saved on the Interrupt Stack is set before executing the RETT
instruction to return to the program being traced. If the
RETT or RETI instructions have to be traced, the Trace Trap
Service Procedure should set the P and T bits in the PSR copy
on the Interrupt Stack that is going to be restored in the
execution of such instructions.

While debugging the NS32CG16 instructions which have interior
loops (BBOR, BBXOR, BBAND, BBFOR, EXTBLT, MOVMP, SBITPS,
TBITS), special care must be taken with the single-step trap.
If an interrupt occurs during a single-step of one of the
graphics instructions, the interrupt will be serviced. Upon
return from the interrupt service routine, the new NS32CG16
instruction will not be re-entered, due to a single-step
trap. Both the NMI and INT interrupts will cause this
behavior. Another single-step operation (S command in
DBG16/MONCG) will resume from where the instruction was
interrupted. There are no side effects from this early
termination, and the instruction will complete normally.

For all other Series 32000 instructions, a single-step
operation will complete the entire instruction before trapping
back to the debugger. On the instructions mentioned above,
several single-step commands may be required to complete the
instruction, ONLY when interrupts are occurring.

There are some methods to give the appearance of single- 
stepping for these NS32CG16 instructions. 

1. MON16/MONCG monitors the PC value on return from the
single-step trap. If the PC has not changed since the last
single-step command was issued, the single-step operation is
repeated. It is also advisable to ensure that one of the
NS32CG16 instructions is being single-stepped, by inspecting
the first byte at the address pointed to by the PC register.
If it is 0x0E, then the instruction is an NS32CG16-specific
instruction.

2. A breakpoint following the instruction would also trap af- 
ter the instruction had completed. 
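
The check described in method 1 can be sketched as below. This is only an illustration of the decision a monitor might make; the function and constant names are hypothetical and are not taken from MON16/MONCG.

```python
# First opcode byte of NS32CG16-specific instructions, as stated in the text.
CG16_OPCODE_PREFIX = 0x0E


def should_repeat_single_step(pc_before, pc_after, first_opcode_byte):
    """Decide whether the monitor should reissue the single-step command.

    The step is repeated only when the PC did not advance (the instruction
    was interrupted before completing) and the instruction at the PC is
    one of the NS32CG16-specific instructions.
    """
    pc_unchanged = pc_after == pc_before
    is_cg16_instruction = first_opcode_byte == CG16_OPCODE_PREFIX
    return pc_unchanged and is_cg16_instruction
```

Checking the opcode byte as well as the PC guards against repeating the step for an ordinary instruction that legitimately branches to its own address.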

Note: If instruction tracing is enabled while the WAIT instruction is executed, 
the Trap (TRC) occurs after the next interrupt, when the interrupt 
service procedure has returned. 

3.7.7 Priority Among Exceptions 

The NS32CG16 CPU internally prioritizes simultaneous in- 
terrupt and trap requests as follows: 

1) Traps other than Trace (Highest priority)

2) Non-Maskable Interrupt

3) Maskable Interrupts

4) Trace Trap (Lowest priority)

3.7.8 Exception Acknowledge Sequences: Detail Flow 

For purposes of the following detailed discussion of inter- 
rupt and trap acknowledge sequences, a single sequence 
called “Service” is defined in Figure 3-21. Upon detecting 
any interrupt request or trap condition, the CPU first per- 
forms a sequence dependent upon the type of interrupt or 
trap. This sequence will include pushing the Processor 
Status Register and establishing a Vector and a Return Ad- 
dress. The CPU then performs the Service sequence. 

3.7.8.1 Maskable/Non-Maskable Interrupt Sequence 

This sequence is performed by the CPU when the NMI pin 
receives a falling edge, or the INT pin becomes active with 
the PSR I bit set. The interrupt sequence begins either at 
the next instruction boundary or, in the case of the String 
instructions, or Graphics instructions which have interior 
loops (BBOR, BBXOR, BBAND, BBFOR, EXTBLT, MOVMP, 
SBITPS, TBITS), at the next interruptible point during its ex- 
ecution. The graphics instructions are interruptible. 

1. If a String instruction was interrupted and not yet
completed:

a. Clear the Processor Status Register P bit.

b. Set “Return Address” to the address of the first byte
of the interrupted instruction.

Otherwise, set “Return Address” to the address of the
next instruction.

2. Copy the Processor Status Register (PSR) into a temporary
register, then clear PSR bits S, U, T, P and I.

3. If the interrupt is Non-Maskable:

a. Read a byte from address FFFF00₁₆, applying Status
Code 0100 (Interrupt Acknowledge, Master: Section
3.4.1). Discard the byte read.

b. Set “Vector” to 1.

c. Go to Step 8.

4. If the interrupt is Non-Vectored:

a. Read a byte from address FFFE00₁₆, applying Status
Code 0100 (Interrupt Acknowledge, Master: Section
3.4.1). Discard the byte read.

b. Set “Vector” to 0.

c. Go to Step 8.

5. Here the interrupt is Vectored. Read “Byte” from address
FFFE00₁₆, applying Status Code 0100 (Interrupt
Acknowledge, Master: Section 3.4.1).

6. If “Byte” ≥ 0, then set “Vector” to “Byte” and go to
Step 8.

7. If “Byte” is in the range -16 through -1, then the
interrupt source is Cascaded. (More negative values are
reserved for future use.) Perform the following:

a. Read the 32-bit Cascade Address from memory. The
address is calculated as INTBASE + 4*Byte.

b. Read “Vector”, applying the Cascade Address just
read and Status Code 0101 (Interrupt Acknowledge,
Cascaded: Section 3.4.1).

8. Push the PSR copy (from Step 2) onto the Interrupt
Stack as a 16-bit value.

9. Perform Service (Vector, Return Address), Figure 3-21.

Service (Vector, Return Address):

1) Read the 32-bit External Procedure Descriptor from the
Interrupt Dispatch Table: address is Vector*4 + INTBASE
Register contents.

2) Move the Module field of the Descriptor into the tempo- 
rary MOD Register. 

3) Read the Program Base pointer from memory address 
MOD + 8, and add to it the Offset field from the Descrip- 
tor, placing the result in the Program Counter. 

4) Read the new Static Base pointer from the memory ad- 
dress contained in MOD, placing it into the SB Register. 

5) Flush Queue: Non-sequentially fetch first instruction of 
Interrupt Routine. 

6) Push MOD Register onto the Interrupt Stack as a 16-bit 
value. (The PSR has already been pushed as a 16-bit 
value.) 

7) Push the Return Address onto the Interrupt Stack as a 
32-bit quantity. 

8) Copy temporary MOD Register to MOD Register. 

FIGURE 3-21. Service Sequence 

Invoked during All Interrupt/Trap Sequences 
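
The Service sequence of Figure 3-21 can be modeled in a few lines. The sketch below is illustrative only. It assumes that the Module field occupies the low-order 16 bits of the External Procedure Descriptor and the Offset field the high-order 16 bits; this layout is not stated in this section. Memory is modeled as a dictionary of 32-bit values and the Interrupt Stack as a list, both hypothetical simplifications.

```python
def service(vector, return_address, regs, mem32, stack):
    """Model of Service (Vector, Return Address).

    regs  : dict with keys "INTBASE", "MOD", "SB", "PC"
    mem32 : dict mapping an address to the 32-bit value stored there
    stack : list used as the Interrupt Stack (last element is the top)
    """
    # Step 1: read the descriptor from the Interrupt Dispatch Table.
    descriptor = mem32[regs["INTBASE"] + 4 * vector]
    # Step 2: Module field into the temporary MOD register (assumed low half).
    temp_mod = descriptor & 0xFFFF
    offset = (descriptor >> 16) & 0xFFFF  # assumed high half
    # Step 3: Program Base pointer at MOD + 8, plus Offset, gives the new PC.
    regs["PC"] = mem32[temp_mod + 8] + offset
    # Step 4: new Static Base pointer from the address contained in MOD.
    regs["SB"] = mem32[temp_mod]
    # Step 5 (queue flush, non-sequential fetch) has no state in this model.
    # Step 6: push the old MOD register as a 16-bit value.
    stack.append(("MOD", 16, regs["MOD"]))
    # Step 7: push the Return Address as a 32-bit quantity.
    stack.append(("RETURN", 32, return_address))
    # Step 8: the temporary MOD register becomes the MOD register.
    regs["MOD"] = temp_mod
```

The PSR copy is not pushed here because each interrupt or trap sequence pushes it before invoking Service.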

3.7.8.2 Trap Sequence: Traps Other Than Trace

1) Restore the currently selected Stack Pointer and the 
Processor Status Register to their original values at the 
start of the trapped instruction. 

2) Set "Vector” to the value corresponding to the trap type. 

SLAVE: Vector = 3.

ILL: Vector = 4.

SVC: Vector = 5.

DVZ: Vector = 6.

FLG: Vector = 7.

BPT: Vector = 8.

UND: Vector = 10.


3) Copy the Processor Status Register (PSR) into a tempo- 
rary register, then clear PSR bits S, U, P and T. 

4) Push the PSR copy onto the Interrupt Stack as a 16-bit 
value. 

5) Set "Return Address” to the address of the first byte of 
the trapped instruction. 

6) Perform Service (Vector, Return Address), Figure 3-21. 
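
The vector assignments used by the acknowledge sequences can be collected into one table. The sketch below is an illustration; the dictionary and function names are hypothetical, the label "NVI" is introduced here for the Non-Vectored Interrupt, and the values are those given in Sections 3.7.8.1 through 3.7.8.3.

```python
# Vector numbers assigned by the exception acknowledge sequences.
VECTORS = {
    "NVI": 0,    # Non-Vectored Interrupt
    "NMI": 1,    # Non-Maskable Interrupt
    "SLAVE": 3,
    "ILL": 4,
    "SVC": 5,
    "DVZ": 6,
    "FLG": 7,
    "BPT": 8,
    "TRC": 9,    # Trace Trap
    "UND": 10,
}


def dispatch_entry_address(intbase, exception):
    """Address of the Interrupt Dispatch Table entry for an exception."""
    return intbase + 4 * VECTORS[exception]
```

Vector 2 does not appear in the table because none of the sequences in this section assigns it.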

3.7.8.3 Trace Trap Sequence

1) In the Processor Status Register (PSR), clear the P bit. 

2) Copy the PSR into a temporary register, then clear PSR 
bits S, U and T. 

3) Push the PSR copy onto the Interrupt Stack as a 16-bit 
value. 

4) Set “Vector” to 9. 

5) Set “Return Address” to the address of the next instruc- 
tion. 

6) Perform Service (Vector, Return Address), Figure 3-21.

3.8 SLAVE PROCESSOR INSTRUCTIONS

The NS32CG16 supports only one group of instructions, the 
floating point instruction set, as being executable by a slave 
processor. The floating point instruction set is validated by 
the F bit in the CFG register. 

If a floating-point instruction is encountered and the F bit in 
the CFG register is not set, a Trap(UND) will result, without 
any slave processor communication attempted by the CPU. 
This allows software emulation in case an external floating 
point unit (FPU) is not used. 

3.8.1 Slave Processor Protocol 

Slave Processor instructions have a three-byte Basic In- 
struction field, consisting of an ID Byte followed by an Oper- 
ation Word. The ID Byte has three functions: 

1) It identifies the instruction as being a Slave Processor 
instruction. 

2) It specifies which Slave Processor will execute it. 

3) It determines the format of the following Operation Word 
of the instruction. 

Upon receiving a Slave Processor instruction, the CPU initi- 
ates the sequence outlined in Figure 3-22. While applying 
Status Code 1111 (Broadcast ID, Section 3.4.1), the CPU 
transfers the ID Byte on the least-significant half of the Data 
Bus (AD0-AD7). All Slave Processors input this byte and 
decode it. The Slave Processor selected by the ID Byte is 
activated, and from this point the CPU is communicating 
only with it. If any other slave protocol was in progress (e.g., 
an aborted Slave instruction), this transfer cancels it. 

The CPU next sends the Operation Word while applying 
Status Code 1101 (Transfer Slave Operand, Section 3.4.1). 
Upon receiving it, the Slave Processor decodes it, and at 
this point both the CPU and the Slave Processor are aware 
of the number of operands to be transferred and their sizes. 
The Operation Word is swapped on the Data Bus; that is, 
bits 0-7 appear on pins AD8-AD15 and bits 8-15 appear 
on pins AD0-AD7. 
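
The byte swap applied to the Operation Word can be expressed directly. This is a minimal sketch with a hypothetical function name; it returns the 16-bit pattern as it would appear on AD0-AD15.

```python
def operation_word_on_bus(operation_word):
    """Return the 16-bit pattern driven on AD0-AD15 for an Operation Word.

    Bits 0-7 of the word appear on AD8-AD15 and bits 8-15 on AD0-AD7.
    """
    low_byte = operation_word & 0xFF
    high_byte = (operation_word >> 8) & 0xFF
    return (low_byte << 8) | high_byte
```

Applying the function twice returns the original word, so a slave processor can recover the Operation Word with the same swap.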

Using the Addressing Mode fields within the Operation
Word, the CPU starts fetching operands and issuing them to
the Slave Processor. To do so, it references any Addressing
Mode extensions which may be appended to the Slave
Processor instruction. Since the CPU is solely responsible
for memory accesses, these extensions are not sent to the
Slave Processor. The Status Code applied is 1101 (Transfer
Slave Processor Operand, Section 3.4.1).

Status Combinations:

Send ID (ID): Code 1111
Xfer Operand (OP): Code 1101
Read Status (ST): Code 1110

Step  Status  Action

1     ID      CPU Sends ID Byte.
2     OP      CPU Sends Operation Word.
3     OP      CPU Sends Required Operands.
4     —       Slave Starts Execution. CPU Pre-Fetches.
5     —       Slave Pulses SPC Low.
6     ST      CPU Reads Status Word. (Trap? Alter Flags?)
7     OP      CPU Reads Results (If Any).

FIGURE 3-22. Slave Processor Protocol

After the CPU has issued the last operand, the Slave Processor
starts the actual execution of the instruction. Upon
completion, it will signal the CPU by pulsing SPC low.


While the Slave Processor is executing the instruction, the
CPU is free to prefetch instructions into its queue. If it fills
the queue before the Slave Processor finishes, the CPU will
wait, applying Status Code 0011 (Waiting for Slave).

Upon receiving the pulse on SPC, the CPU uses SPC to 
read a Status Word from the Slave Processor, applying 
Status Code 1110 (Read Slave Status). This word has the 
format shown in Figure 3-23. If the Q bit (“Quit”, Bit 0) is set, 
this indicates that an error was detected by the Slave Proc- 
essor. The CPU will not continue the protocol, but will imme- 
diately trap through the Slave vector in the Interrupt Table. 
Certain Slave Processor instructions cause CPU PSR bits to 
be loaded from the Status Word. 

The last step in the protocol is for the CPU to read a result, 
if any, and transfer it to the destination. The Read cycles 
from the Slave Processor are performed by the CPU while 
applying Status Code 1101 (Transfer Slave Operand). 
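
The order of bus cycles in the protocol can be summarized by a small generator of (status code, action) pairs. This is a sketch under simplifying assumptions: each operand and the result are counted as one transfer regardless of size, the slave execution phase is omitted because it involves no CPU transfer, and all names are hypothetical.

```python
BROADCAST_ID = 0b1111       # Send ID
TRANSFER_OPERAND = 0b1101   # Xfer Operand
READ_STATUS = 0b1110        # Read Status


def slave_bus_cycles(num_operands, has_result, quit_bit):
    """List the CPU bus cycles for one Slave Processor instruction."""
    cycles = [
        (BROADCAST_ID, "send ID byte"),
        (TRANSFER_OPERAND, "send operation word"),
    ]
    cycles += [(TRANSFER_OPERAND, "send operand")] * num_operands
    # The slave executes and pulses SPC low; the CPU then reads status.
    cycles.append((READ_STATUS, "read status word"))
    if quit_bit:
        # Q bit set: the protocol stops and the CPU takes Trap (SLAVE).
        cycles.append((None, "Trap (SLAVE)"))
        return cycles
    if has_result:
        cycles.append((TRANSFER_OPERAND, "read result"))
    return cycles
```

For an instruction with two memory operands and a result, the list contains six entries, ending with the result read.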

3.8.2 Floating Point Instructions 

Table 3-5 gives the protocols followed for each Floating 
Point instruction. The instructions are referenced by their 
mnemonics. For the bit encodings of each instruction, see 
Appendix A. 


TABLE 3-5. Floating Point Instruction Protocols

           Operand 1  Operand 2  Operand 1  Operand 2  Returned Value   PSR Bits
Mnemonic   Class      Class      Issued     Issued     Type and Dest.   Affected

ADDf       read.f     rmw.f      f          f          f to Op. 2       none
SUBf       read.f     rmw.f      f          f          f to Op. 2       none
MULf       read.f     rmw.f      f          f          f to Op. 2       none
DIVf       read.f     rmw.f      f          f          f to Op. 2       none
MOVf       read.f     write.f    f          N/A        f to Op. 2       none
ABSf       read.f     write.f    f          N/A        f to Op. 2       none
NEGf       read.f     write.f    f          N/A        f to Op. 2       none
CMPf       read.f     read.f     f          f          N/A              N,Z,L
FLOORfi    read.f     write.i    f          N/A        i to Op. 2       none
TRUNCfi    read.f     write.i    f          N/A        i to Op. 2       none
ROUNDfi    read.f     write.i    f          N/A        i to Op. 2       none
MOVFL      read.F     write.L    F          N/A        L to Op. 2       none
MOVLF      read.L     write.F    L          N/A        F to Op. 2       none
MOVif      read.i     write.f    i          N/A        f to Op. 2       none
LFSR       read.D     N/A        D          N/A        N/A              none
SFSR       N/A        write.D    N/A        N/A        D to Op. 2       none
POLYf      read.f     read.f     f          f          f to F0          none
DOTf       read.f     read.f     f          f          f to F0          none
SCALBf     read.f     rmw.f      f          f          f to Op. 2       none
LOGBf      read.f     write.f    f          N/A        f to Op. 2       none

Note:

D = Double Word
i = integer size (B,W,D) specified in mnemonic.
f = Floating Point type (F,L) specified in mnemonic.
N/A = Not Applicable to this instruction.

The Operand class columns give the Access Class for each 
general operand, defining how the addressing modes are 
interpreted (see Series 32000 Instruction Set Reference 
Manual). 

The Operand Issued columns show the sizes of the operands
issued to the Floating Point Unit by the CPU. “D” indicates
a 32-bit Double Word, “i” indicates that the instruction
specifies an integer size for the operand (B = Byte,
W = Word, D = Double Word), “f” indicates that the instruction
specifies a Floating Point size for the operand (F = 32-bit
Standard Floating, L = 64-bit Long Floating).

The Returned Value Type and Destination column gives the
size of any returned value and where the CPU places it. The
PSR Bits Affected column indicates which PSR bits, if any,
are updated from the Slave Processor Status Word (Figure
3-23).

15             8 7             0
0 0 0 0 0 0 0 0  N Z F 0 0 L 0 Q

Bits 7, 6, 5 and 2 (N, Z, F, L): New PSR Bit Value(s).
Bit 0 (Q): “Quit”: Terminate Protocol, Trap(FPU).

TL/EE/9424-28

FIGURE 3-23. Slave Processor Status Word Format 
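
The Status Word layout of Figure 3-23 can be decoded as shown below. This is a sketch with hypothetical names; the bit positions (N = bit 7, Z = bit 6, F = bit 5, L = bit 2, Q = bit 0) are read from the figure and should be checked against it.

```python
STATUS_BITS = {"N": 7, "Z": 6, "F": 5, "L": 2, "Q": 0}


def decode_slave_status(word):
    """Split a Slave Processor Status Word into its named flag bits."""
    return {name: bool((word >> bit) & 1) for name, bit in STATUS_BITS.items()}


def must_trap(word):
    """True when the Q ("Quit") bit is set and the CPU must trap."""
    return decode_slave_status(word)["Q"]
```

A CMPf result of "equal", for instance, would be reported with only the Z bit set (0040 hex in this layout).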

Any operand indicated as being of type “f” will not cause a 
transfer if the Register addressing mode is specified. This is 
because the Floating Point Registers are physically on the 
Floating Point Unit and are therefore available without CPU 
assistance. 

4.0 Device Specifications 

4.1 NS32CG16 PIN DESCRIPTIONS 

The following is a brief description of all NS32CG16 pins. 
The descriptions reference portions of the Functional De- 
scription, Section 3. 

Unless otherwise indicated, reserved pins should be left 
open. 

Note: An asterisk next to the signal name indicates a TRI-STATE condition 
for that signal during HOLD acknowledge. 

4.1.1 Supplies 

VCCL       Logic Power.
           +5V positive supply for on-chip logic.

VCCCTTL,   Buffers Power.
VCCFCLK,   +5V positive supplies for on-chip output
VCCAD,     buffers.
VCCIO

VSSL       Logic Ground.
           Ground reference for on-chip logic.

VSSFCLK,   Buffers Ground.
VSSNTSC,   Ground reference for on-chip output buffers.
VSSHAD,
VSSLAD,
VSSIO


4.1.2 Input Signals 

RSTI Reset Input. 

Schmitt triggered, asynchronous signal used to 
generate a CPU reset. See Section 3.3. 

Note: 

The reset signal is a true asynchronous input. Therefore, no 
external synchronizing circuit is needed. 

When RSTI changes right before the falling edge of CTTL, 
and meets the specified set-up time, it will be recognized on 
that falling edge. Otherwise it will be recognized on the fall- 
ing edge of CTTL in the following clock cycle. 

HOLD Hold Request. 

When active, causes the CPU to release the 
bus for DMA or multiprocessing purposes. See 
Section 3.5. 

Note: 

If the HOLD signal is generated asynchronously, its set up 
and hold times may be violated. In this case, it is recom- 
mended to synchronize it with CTTL to minimize the possibili- 
ty of metastable states. 

The CPU provides only one synchronization stage to mini- 
mize the HLDA latency. This is to avoid speed degradations 
in cases of heavy HOLD activity (i.e., DMA controller cycles 
interleaved with CPU cycles). 

INT Interrupt. 

A low level on this pin requests a maskable in- 
terrupt. INT must be kept asserted until the in- 
terrupt is acknowledged. 

NMI Non-Maskable Interrupt. 

A High-to-Low transition on this signal requests
a non-maskable interrupt.

CWAIT Continuous Wait.

Causes the CPU to insert continuous wait 
states if sampled low at the end of T2 and each 
following T-State. See Section 3.4.3. 

WAIT1-2 Two-Bit Wait State Inputs. 

These inputs, collectively called WAIT1-2, al- 
low from zero to three wait states to be speci- 
fied. They are binary weighted. See Section 
3.4.3. 

Note: During a DMA cycle, WAIT1-2 should be kept inactive 
unless they are also monitored by the DMA Controller. 
Wait states, in this case, should be generated through 
CWAIT. 

OSCIN Crystal/External Clock Input. 

Input from a crystal or an external clock source. 
See Section 3.2. 

4.1.3 Output Signals 

A16-A23 *High-Order Address Bits. 

These are the most significant 8 bits of the 
memory address bus. 

HBE *High Byte Enable.

Status signal used to enable data transfers on
the most significant byte of the data bus.

ST0-3 Status.

Bus cycle status code; ST0 is the least significant.
Encodings are:

0000 — Idle: CPU Inactive on Bus.

0001 — Idle: WAIT Instruction.

0010 — (Reserved)

0011 — Idle: Waiting for Slave.

0100 — Interrupt Acknowledge, Master.

0101 — Interrupt Acknowledge, Cascaded.

0110 — End of Interrupt, Master.

0111 — End of Interrupt, Cascaded.

1000 — Sequential Instruction Fetch.

1001 — Non-Sequential Instruction Fetch.

1010 — Data Transfer.

1011 — Read Read-Modify-Write Operand.

1100 — Read for Effective Address.

1101 — Transfer Slave Operand.

1110 — Read Slave Status Word.

1111 — Broadcast Slave ID.

U/S User/Supervisor.

User or Supervisor Mode status. High indicates
User Mode; low indicates Supervisor Mode.

ILO Interlocked Operation.

When active, indicates that an interlocked oper- 
ation is being executed. 

HLDA Hold Acknowledge. 

Activated by the CPU in response to the HOLD 
input to indicate that the CPU has released the 
bus. 

PFS Program Flow Status. 

A pulse on this signal indicates the beginning of 
execution of an instruction. 

BPU BPU Cycle.

This signal is activated during a bus cycle to
enable an external BITBLT processing unit. The
EXTBLT instruction activates this signal.*

RSTO Reset Output.

This signal becomes active when RSTI is low, 
initiating a system reset. 

RD Read Strobe. 

Activated during CPU or DMAC read cycles to 
enable reading of data from memory or periph- 
erals. See Section 3.4.2. 


TSO Timing State Output.

The falling edge of TSO identifies the beginning 
of state T2 of a bus cycle. The rising edge iden- 
tifies the beginning of state T4. 

DBE Data Buffers Enable.

Used to control external data buffers. It is active
when the data buffers are to be enabled.

OSCOUT Crystal Output.

This line is used as the return path for the crys- 
tal (if used). When an external clock source is 
used, OSCOUT should be left unconnected or 
loaded with no more than 5 pF of stray capaci- 
tance. 

FCLK Fast Clock. 

This clock is derived from the clock waveform 
on OSCIN. Its frequency is either the same as 
OSCIN or is lower, depending upon the scale 
factor programmed into the CFG register. See 
Section 3.2.1. 

PHI1, PHI2 Two-Phase Clock.

These outputs provide a two-phase clock with 
frequency half that of FCLK. They can be used 
to clock the DP8510/DP8511 BPU. The trace 
lengths of PHI1 and PHI2 should be shorter 
than 4 inches (10 centimeters) when connected 
to the BPU. 

CTTL System Clock. 

This clock is similar to PHI1 but has a much 
higher driving capability. The skew between its 
rising edge and PHI1 rising edge is kept to a 
minimum. 

4.1.4 Input-Output Signals

AD0-15 *Address/Data Bus.

Multiplexed Address/Data information. Bit 0 is 
the least significant bit of each. 

SPC Slave Processor Control.

Used by the CPU as the data strobe output for
slave processor transfers; used by a slave processor
to acknowledge completion of a slave instruction.
See Section 3.4.7.1.

DDIN *Data Direction.

Status signal indicating the direction of the data
transfer during a bus cycle. During HOLD acknowledge
this signal becomes an input and
determines the activation of RD or WR.


ADS *Address Strobe.

Controls address latches; signals the beginning
of a bus cycle. During HOLD acknowledge this
signal becomes an input and the CPU monitors
it to detect the beginning of a DMA cycle and
generate the relevant strobe signals. When a
DMA is used, ADS should be pulled up to VCC
through a 10 kΩ resistor.


WR Write Strobe. 

Activated during CPU or DMAC write cycles to 
enable writing of data to memory or peripherals. 

*Note: BPU is low (Active) only during bus cycles involving
pre-fetching instructions and execution of EXTBLT
operands. It is recommended that BPU, ADS and
status lines (ST0-ST3) be used to qualify BPU bus
cycles. If a DMA circuit exists in the system, the
HLDA signal should be used to further qualify BPU
cycles. BPU may become active during T4 of a non-BPU
bus cycle, and may become inactive during T4
of a BPU bus cycle. BPU must be qualified by ADS
and status lines (ST0-ST3) to be used as an external
gating signal.

4.2 ABSOLUTE MAXIMUM RATINGS 
If Military/Aerospace specified devices are required, 
please contact the National Semiconductor Sales 
Office/Distributors for availability and specifications. 

Temperature Under Bias                 0°C to +70°C

Storage Temperature                    -65°C to +150°C


All Input or Output Voltages with
Respect to GND                         -0.5V to +7V

Note: Absolute maximum ratings indicate limits beyond 
which permanent damage may occur. Continuous operation 
at these limits is not intended; operation should be limited to 
those conditions specified under Electrical Characteristics. 


4.3 ELECTRICAL CHARACTERISTICS: TA = 0°C to +70°C, VCC = 5V ±5%, GND = 0V

Symbol  Parameter                             Conditions                              Limit

VIH     High Level Input Voltage              (Note 4)
VIL     Low Level Input Voltage               (Note 3)
VT+     RSTI Rising Threshold Voltage         VCC = 5.0V (Note 5)
VHYS    RSTI Hysteresis Voltage               VCC = 5.0V (Note 5)
VXL     OSCIN Input Low Voltage
VXH     OSCIN Input High Voltage
VOH     High Level Output Voltage             IOH = -400 µA (Note 6)
VOL     Low Level Output Voltage              IOL = 4 mA (Note 6)
IILS    SPC Input Current (low)               VIN = 0.4V, SPC in Input Mode
        Input Load Current                    0 ≤ VIN ≤ VCC, All Inputs except SPC
        Leakage Current,                      0.4 ≤ VOUT ≤ VCC
        Output and I/O Pins in
        TRI-STATE Input Mode
        Active Supply Current                 IOUT = 0, TA = 25°C (Note 2)
        PHI1, 2 High Level Output Voltage     IOH = -400 µA                           0.9 VCC
        PHI1, 2 Low Level Output Voltage      IOL = 4 mA                              0.1 VCC


Note 1: Care should be taken by designers to provide a minimum inductance path between the VSS pins and system ground in order to minimize noise.

Note 2: ICC is affected by the clock scaling factor selected by the C and M bits in the CFG register, see Section 3.2.1.

Note 3: VIL min, in the range of -0.5V to -1.5V: the pulse must be ≤ 20 ns, and the period between pulses ≥ 120 ns.

Note 4: VIH max, in the range of VCC + 0.5V to VCC + 2.0V: the pulse must be ≤ 25 ns, and the period between pulses ≥ 120 ns.

Note 5: Not 100% tested.

Note 6: All outputs except PHI1 and PHI2.


68-Pin PCC Package

Bottom View

4.4 SWITCHING CHARACTERISTICS 


4.4.1 Definitions 

All the timing specifications given in this section refer to 
0.8V or 2.0V on the rising or falling edges of CTTL when the 
capacitive loading of CTTL is 100 pF, unless specifically 
stated otherwise. The timing specifications refer to 0.8 or 
2.0 V on the TTL output and input signals as illustrated in 
Figures 4-2 and 4-3 unless specifically stated otherwise. 




FIGURE 4-1. Connection Diagram 


TL/EE/9424-29 



FIGURE 4-2. Timing Specification Standard 
(TTL Output Signals)

ABBREVIATIONS: 

L.E. — leading edge R.E. — rising edge 
T.E. — trailing edge F.E. — falling edge 



TL/ EE/9424-31 

FIGURE 4-3. Timing Specification Standard 
(TTL Input Signals) 


4.4.2 DEVICE TESTING 

TEST EQUIPMENT 



TL/EE/9424-65 

FIGURE 4-4. Test Loading Configuration


TABLE 4-1. Test Loading Characteristics

                       Capacitive  High Level Output Voltage  Low Level Output Voltage  Input Load Current    High Level                Low Level
Signal Name            Loading     (IOH = -400 µA)            (IOL = 4 mA)              (0 ≤ VIN ≤ VCC)       Input Voltage             Input Voltage

HBE, ST0-3, U/S,       50 pF       2.0V ≤ VOH ≤ VCC + 0.5V    -0.5V ≤ VOL ≤ 0.8V        -20 µA ≤ II ≤ 20 µA   2.0V ≤ VIH ≤ VCC + 0.5V   -0.5V ≤ VIL ≤ 0.45V
ILO, HLDA, PFS,
BPU, RSTO, RD,
WR, TSO, DBE,
FCLK, DDIN, ADS

RSTI, HOLD, INT,       50 pF                                                            -20 µA ≤ II ≤ 20 µA   2.0V ≤ VIH ≤ VCC + 0.5V   -0.5V ≤ VIL ≤ 0.8V
NMI, CWAIT, WAIT1-2

OSCIN                  50 pF                                                            -20 µA ≤ II ≤ 20 µA   4.5V ≤ VIH ≤ VCC + 0.5V   -0.5V ≤ VIL ≤ 0.5V

AD0-15, A16-23,        100 pF      2.0V ≤ VOH ≤ VCC + 0.5V    -0.5V ≤ VOL ≤ 0.8V        -20 µA ≤ II ≤ 20 µA   2.4V ≤ VIH ≤ VCC + 0.5V   -0.5V ≤ VIL ≤ 0.45V
CTTL

PHI1, PHI2             30 pF       (Note 2)                   (Note 2)

SPC                    30 pF       2.0V ≤ VOH ≤ VCC + 0.5V    -0.5V ≤ VOL ≤ 0.8V        50 µA ≤ II ≤ 1.0 mA   2.0V ≤ VIH ≤ VCC + 0.5V   -0.5V ≤ VIL ≤ 0.4V

OSCOUT                 see Table   2.0V ≤ VOH ≤ VCC + 0.5V    -0.5V ≤ VOL ≤ 0.8V
(Note 1)               3-1

Note 1: The maximum capacitive loading of OSCOUT is given in Table 3-1 when the NS32CG16’s oscillator is driven with a crystal. If a single phase clock source is
used, OSCOUT should be left unconnected or loaded with no more than 5 pF of stray capacitance.

Note 2: As stated in Table 4.4.3.

4.4.3 Timing Tables 

4.4.3.1 Output Signals: Internal Propagation Delays, NS32CG16-10 and NS32CG16-15


Name        Figure  Description

tCTp        4-20    CTTL Clock Period
tCTh        4-20    CTTL High Time, At 1.5V (Both Edges) (see Note 1)
tCTl        4-20    CTTL Low Time
tCTr        4-20    CTTL Rise Time
tCTf        4-20    CTTL Fall Time
tCLw(1,2)   4-20    PHI1, PHI2 Pulse Width
tCLh        4-20    Clock High Time
tnOVL(1,2)  4-20    PHI1, PHI2 Non-Overlap Time
tXFr        4-20    OSCIN to FCLK R.E. Delay
tFCr        4-20    FCLK to CTTL R.E. Delay
tFCf        4-20    FCLK to CTTL F.E. Delay
tPCr        4-20    CTTL and PHI1 Skew
tALv                Address Bits 0-15 Valid
tALh                Address Bits 0-15 Hold
tAHv                Address Bits 16-23 Valid
tAHh                Address Bits 16-23 Hold
tALfr               Address Bits 0-15 Floating (during read)
tALnfr              AD0-AD15 Floating (Note 2)


Reference/Conditions 


R.E., CTTL to Next R.E., CTTL 


25 pF-100 pF Capacitive Load 


At 0.8V 

25 pF-100 pF Capacitive Load 


0.8V to 2.0V V C c on R.E., CTTL 


2.0 V to 0.8V V C c on F.E., CTTL 


At 2.0V on PHI1.PHI2 
(Both Edges) 


At 90% V C c on PH1 1, PH 12 
(Both Edges) 


80% Vccon R.E., OSCIN 
to R.E., FCLK 


R.E., FCLK to R.E., CTTL 


R.E., FCLK to F.E., CTTL 


NS32CG16-10 


NS32CG16-15 
(Note 3) 


Min 

Max 

Min 

Max 

100 

1000 

66 

1000 

0.40 

0.57 

0.46 

0.58 

0.42 

0.56 

0.40 

0.53 


8 

0 

6 

0 

8 


6 

0.35 

0.55 

0.32 

0.53 

0.22 

0.50 

0.28 

0.50 


R.E., CTTL to R.E., PHI1 


after R.E., CTTL T1 


after R.E., CTTL T2 





Note 1: Device testing is performed using the Test Loading Characteristics in Table 4-1. Additional timing data for CTTL with various capacitive loads is not 100%
tested.

Note 2: tALnfr is address bits 0-15 floating or not active after R.E. CTTL T1. This is only valid if the previous CPU cycle was a read (Figure 4-5). A previous write
may have “data” active into T1 of the next cycle which then becomes “address” during T1.

Note 3: 15 MHz specifications are only guaranteed when tctp = 66 ns. 
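
The relation between the CTTL clock period and the nominal device speed can be checked with a short calculation. This is a sketch with a hypothetical function name; it only restates frequency = 1 / period for the two minimum tCTp values listed in the table (100 ns and 66 ns).

```python
def cttl_frequency_mhz(tctp_ns):
    """Convert a CTTL clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / tctp_ns


# Minimum clock periods from the timing table, per speed grade.
MIN_TCTP_NS = {"NS32CG16-10": 100, "NS32CG16-15": 66}
```

A 100 ns period corresponds to 10 MHz, and a 66 ns period to slightly more than 15 MHz.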


4.4.3.1 Output Signals: Internal Propagation Delays, NS32CG16-10 and NS32CG16-15 (Continued) 


| Name | Figure | Description | Reference/Conditions | NS32CG16-10 Min | NS32CG16-10 Max | NS32CG16-15 Min | NS32CG16-15 Max | Units |
|---|---|---|---|---|---|---|---|---|
| tALf | (illegible) | AD0-AD15 Floating (Caused by HOLD) | after R.E., CTTL Ti | | 25 | | 18 | ns |
| tAHf | 4-7 | A16-A23 Floating | after R.E., CTTL Ti | | 25 | | 18 | ns |
| tALnf | 4-5, 4-8 | Address Bits 0-15 Not Floating | after R.E., CTTL T1 | (illegible) | 36 | (illegible) | 26 | ns |
| tAHnf | 4-8 | Address Bits 16-23 Not Floating | after R.E., CTTL T4 | (illegible) | 36 | (illegible) | 26 | ns |
| tDv | 4-6, 4-10 | Data Valid (Write Cycle) | after R.E., CTTL T2 or T1 | | 50 | | 38 | ns |
| tDh | 4-6, 4-10 | Data Hold | after R.E., CTTL Next T1 or Ti | 0 | | 0 | | ns |
| tADSa | 4-5 | ADS Signal Active | after R.E., CTTL T1 | 5 | 35 | 5 | 26 | ns |
| tADSia | (illegible) | ADS Signal Inactive | after F.E., CTTL T1 | 5 | 35 | 5 | 25 | ns |
| tADSw | 4-6 | ADS Pulse Width | at 15% VCC (Both Edges) | 30 | | 25 | | ns |
| tADSf | 4-7 | ADS Floating | after R.E., CTTL Ti | | 55 | | 40 | ns |
| tADSr | 4-8 | ADS Return from Floating | after R.E., CTTL Ti | | 55 | | 40 | ns |
| tALADSs | 4-6 | Address Bits 0-15 Setup | before ADS T.E. | 25 | | 20 | | ns |
| tAHADSs | 4-6 | Address Bits 16-23 Setup | before ADS T.E. | 25 | | 20 | | ns |
| tALADSh | (illegible) | Address Bits 0-15 Hold | after ADS T.E. | 12 | | 12 | | ns |
| tHBEv | 4-5 | HBE Signal Valid | after R.E., CTTL T1 | | 60 | | 38 | ns |
| tHBEh | 4-5 | HBE Signal Hold | after R.E., CTTL Next T1 or Ti | 0 | | 0 | | ns |
| tHBEf | 4-7 | HBE Signal Floating | after R.E., CTTL Ti | | 55 | | 40 | ns |
| tHBEr | 4-8 | HBE Return from Floating | after R.E., CTTL Ti | | 55 | | 40 | ns |
| tDDINv | (illegible) | DDIN Signal Valid | after R.E., CTTL T1 | | 65 | | 38 | ns |
| tDDINh | (illegible) | DDIN Signal Hold | after R.E., CTTL Next T1 or Ti | 0 | | 0 | | ns |
| tDDINf | 4-7 | DDIN Floating | after R.E., CTTL Ti | | 55 | | 40 | ns |
| tDDINr | 4-8 | DDIN Return from Floating | after R.E., CTTL Ti | | 55 | | 40 | ns |
| tSPCa | 4-10 | SPC Output Active | after R.E., CTTL T1 | | 35 | 5 | 26 | ns |
| tSPCia | 4-10 | SPC Output Inactive | after R.E., CTTL T4 | | 35 | 5 | 26 | ns |
| tSPCnf | 4-12 | SPC Output Non-Forcing (Note 2) | after F.E., CTTL T4 | | tCTp + 10 | | tCTp + 8 | ns |
| tHLDAa | 4-7 | HLDA Signal Active | after R.E., CTTL Ti | | 50 | | 26 | ns |
| tHLDAia | 4-8 | HLDA Signal Inactive | after R.E., CTTL Ti | | 50 | | 26 | ns |
| tSTv | (illegible) | Status ST0-ST3 Valid | after R.E., CTTL T4 (before T1, see Note 1) | | 45 | | 38 | ns |
| tSTh | (illegible) | Status ST0-ST3 Hold | after R.E., CTTL T4 | 0 | | 0 | | ns |
| tBPUv | 4-5 | BPU Signal Valid | after R.E., CTTL T4 | | 45 | | 30 | ns |
| tBPUh | 4-5 | BPU Signal Hold | after R.E., CTTL T4 | 5 | | 5 | | ns |


Note 1: Every memory cycle starts with T4, during which Cycle Status is applied. If the CPU was idling, the sequence will be: ". . . Ti, T4, T1 . . .". If the CPU was
not idling, the sequence will be: ". . . T4, T1 . . .".


Note 2: If the CPU is connected directly to the FPU and the CTTL loading is not violated, the CPU and FPU will function correctly together. The CPU and FPU
connect directly without buffers. They should be located less than 4 inches (10 centimeters) apart. tSPCa and tSPCia will track each other on all CPUs and therefore
it is not possible to have a minimum tSPCia and a maximum tSPCa value. The pulse width minimum, tSPCw, of the FPU will not be violated by the NS32CG16 when
connected directly to the FPU.
4.4.3.1 Output Signals: Internal Propagation Delays, NS32CG16-10 and NS32CG16-15 (Continued)


| Name | Figure | Description | Reference/Conditions | NS32CG16-10 | NS32CG16-15 |
|---|---|---|---|---|---|
| tTSOa | (illegible) | TSO Signal Active | after R.E., CTTL T2 | | |
| tTSOia | 4-5 | TSO Signal Inactive | after R.E., CTTL T4 | | |
| tRDa | 4-5 | RD Signal Active | after R.E., CTTL T2 | | |
| tRDia | | RD Signal Inactive | after R.E., CTTL T4 | | |
| tWRa | 4-6 | WR Signal Active | after R.E., CTTL T2 | | |
| tWRia | (illegible) | WR Signal Inactive | after R.E., CTTL T4 | | |
| tDBEa(R) | 4-5 | DBE Active (Read Cycle) | after F.E., CTTL T2 | | |
| tDBEa(W) | 4-6 | DBE Active (Write Cycle) | after R.E., CTTL T2 | | |
| tDBEia | (illegible) | DBE Inactive | after F.E., CTTL T4 | | |
| tUSv | (illegible) | U/S Signal Valid | after R.E., CTTL T4 | | |
| tUSh | 4-5 | U/S Signal Hold | after R.E., CTTL T4 | | |
| tPFSa | 4-13 | PFS Signal Active | after F.E., CTTL | | |
| tPFSia | 4-13 | PFS Signal Inactive | after F.E., CTTL | | |
| tPFSw | 4-13 | PFS Pulse Width | at 15% VCC (Both Edges) | | |
| tNSPF | 4-16 | Nonsequential Fetch to Next PFS Clock Cycle | after R.E., CTTL T1 | | |
| tPFNS | 4-15 | PFS Clock Cycle to Next Nonsequential Fetch | before R.E., CTTL T1 | | |
| tLXPF | 4-14 | Last Operand Transfer of an Instruction to Next PFS Clock Cycle | before R.E., CTTL T1 of First Bus Cycle of Transfer | | |
| tILOs | 4-17 | ILO Signal Setup | before R.E., CTTL T1 of First Interlocked Read Cycle | | |
| tILOh | 4-18 | ILO Signal Hold | after R.E., CTTL T3 of Last Interlocked Write Cycle | | |
| tILOa | 4-19 | ILO Signal Active | after R.E., CTTL | | |
| tILOia | 4-19 | ILO Signal Inactive | after R.E., CTTL | | |
| tRSTOa | 4-22 | RSTO Signal Active | after R.E., CTTL | | |
| tRSTOia | 4-22 | RSTO Signal Inactive | after R.E., CTTL | | |
| tRTOI | 4-22 | Reset to Idle | after F.E. of RSTO | | |
| tRTOF | 4-22 | Reset to Fetch | after R.E. of RSTO | | |


4.4.3.2 Input Signal Requirements: NS32CG16-10 and NS32CG16-15 


| Name | Figure | Description | Reference/Conditions | NS32CG16-10 Min | NS32CG16-10 Max | NS32CG16-15 Min | NS32CG16-15 Max | Units |
|---|---|---|---|---|---|---|---|---|
| tXp | 4-20 | OSCIN Clock Period | R.E., OSCIN to Next R.E., OSCIN | 50 | 500 | 33 | 500 | ns |
| tXh | 4-20 | OSCIN High Time (External Clock) | at 4.2V (Both Edges) | 16 | | 11 | | ns |
| tXl | 4-20 | OSCIN Low Time | at 1.0V (Both Edges) | 16 | | 11 | | ns |
| tDIs | 4-5, 4-11 | Data In Setup | before R.E., CTTL T4 | 18 | | 15 | | ns |
| tDIh | 4-5, 4-11 | Data In Hold (see Note 1) | after R.E., CTTL T4 | (illegible) | | (illegible) | | ns |
| tCWs | (illegible) | CWAIT Signal Setup | before R.E., CTTL T3 or T3(w) | 20 | | 20 | | ns |
| tCWh | | CWAIT Signal Hold | after R.E., CTTL T3 or T3(w) | 5 | | 5 | | ns |
| tWs | | WAITn Signals Setup | before R.E., CTTL T3 or T3(w) | 20 | | 20 | | ns |
| tWh | | WAITn Signals Hold | after R.E., CTTL T3 or T3(w) | 5 | | 5 | | ns |
| tHLDs | 4-7, 4-8 | HOLD Setup Time | before R.E., CTTL TX2 or Ti | 30 | | 22 | | ns |
| tHLDh | 4-7, 4-8 | HOLD Hold Time | after R.E., CTTL Ti | 0 | | 0 | | ns |
| tPWR | 4-21 | Power Stable to RSTI R.E. | after VCC Reaches 4.5V | 50 | | 33 | | μs |
| tRSTs | 4-21, 4-22 | RSTI Signal Setup | before F.E., CTTL | 20 | | 20 | | ns |
| tRSTw | 4-22 | RSTI Pulse Width | at 0.8V (Both Edges) | 64 | | 64 | | tCTp |
| tSPCh | 4-12 | SPC Hold Time (see Note 3) | after R.E., CTTL | 0 | | 0 | | ns |
| tINTh | 4-23 | INT Signal Hold | after Interrupt Acknowledge | | 8 | | 8 | (illegible) |
| tNMIw | 4-24 | NMI Pulse Width | at 0.8V (Both Edges) | 70 | | 50 | | ns |
| tSPCd | 4-12 | SPC Pulse Delay from Slave | after F.E., CTTL T4 | 2 | | 2 | | tCTp |
| tSPCs | 4-12 | SPC Input Setup | before F.E., CTTL | 37 | | 30 | | ns |
| tADSs | 4-9 | ADS Input Setup | before F.E., CTTL | 15 | | 10 | | ns |
| tADSh | (illegible) | ADS Input Hold (see Note 2) | after F.E., CTTL T1 | 10 | | 10 | | ns |
| tDDINs | 4-9 | DDIN Input Setup | before F.E., CTTL | 15 | | 10 | | ns |
| tDDINh | 4-9 | DDIN Input Hold | after R.E., CTTL T4 | 7 | | 5 | | ns |


Note 1: tDIh is always less than or equal to tRDia.

Note 2: ADS must be deasserted before state T4 of the DMA controller cycle. 
Note 3: Not tested, guaranteed by design. 
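
Two of the input requirements above (RSTI pulse width and SPC pulse delay from slave) are stated in CTTL clock periods (tCTp) rather than nanoseconds. The following is a minimal sketch of the conversion, assuming the minimum CTTL periods of 100 ns (NS32CG16-10) and 66 ns (NS32CG16-15) given in the clock timing table; the function and dictionary names are illustrative only and are not part of any National Semiconductor tool.

```python
# Minimum CTTL clock period (tCTp) in ns for each speed grade, as listed in
# the clock timing table (66 ns is also the condition stated in Note 3 there).
T_CTP_MIN_NS = {"NS32CG16-10": 100, "NS32CG16-15": 66}


def cycles_to_ns(cycles, part):
    """Convert a limit expressed in tCTp units to ns at the minimum clock period."""
    return cycles * T_CTP_MIN_NS[part]


def cttl_frequency_mhz(part):
    """CTTL frequency in MHz that corresponds to the minimum clock period."""
    return 1000.0 / T_CTP_MIN_NS[part]


if __name__ == "__main__":
    for part in T_CTP_MIN_NS:
        print(part,
              "RSTI pulse width (64 tCTp):", cycles_to_ns(64, part), "ns;",
              "CTTL:", round(cttl_frequency_mhz(part), 2), "MHz")
```

At slower clocks the same cycle counts correspond to proportionally longer times, so the figures printed here are the shortest durations that can satisfy the requirement.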



Note 1: ADS must be deactivated before state T4 of the DMA controller cycle. 

Note 2: During a DMA cycle WAIT1-2 must be kept inactive unless they are monitored by the DMA Controller. A DMA cycle is similar to a CPU cycle. The
NS32CG16 generates TSO, RD, WR and DBE. The DMAC drives the address/data lines HBE, ADS and DDIN. 

Note 3: During a DMA cycle, if the ADS signal is pulsed in order to initiate a bus cycle, the HOLD signal must remain asserted until state T4 of the DMAC cycle. 




After transferring the last operand to the FPU, the CPU turns OFF the
output driver and holds SPC high with an internal 5 kΩ pullup.
FIGURE 4-13. Relationship of PFS to Clock Cycles



FIGURE 4-18. Relationship of ILO to Last Operand Cycle of an Interlocked Instruction





FIGURE 4-22. Non-Power-On Reset 
Note 1: During Reset the HOLD signal must be kept high. 

Note 2: After RSTI is deasserted the first bus cycle will be an instruction fetch at address zero. 
FIGURE 4-23. INT Interrupt Signal Detection

Note 1: Once INT is asserted, it must remain asserted until it is acknowledged.

Note 2: INTA is the Interrupt Acknowledge bus cycle (not a CPU signal). Refer to Section 3.4.1 and Table 3.4. 




FIGURE 4-24. NMI Interrupt Signal Timing 
Appendix A: Instruction Formats

NOTATIONS 

i = Integer Type Field 
B = 00 (Byte) 

W = 01 (Word) 

D = 11 (Double Word)
f = Floating Point Type Field 
F = 1 (Std. Floating: 32 bits) 

L = 0 (Long Floating: 64 bits) 
op = Operation Code 

Valid encodings shown with each format.
gen, gen1, gen2 = General Addressing Mode Field
See Sec. 2.3.2 for encodings.
reg = General Purpose Register Number
cond = Condition Code Field 

0000 = EQual: Z = 1 

0001 = Not Equal: Z = 0 

0010 = Carry Set: C = 1 

0011 = Carry Clear: C = 0

0100 = Higher: L = 1 

0101 = Lower or Same: L = 0 

0110 = Greater Than: N = 1 

0111 = Less or Equal: N = 0 

1000 = Flag Set: F = 1 

1001 = Flag Clear: F = 0 

1010 = LOwer: L = 0 and Z = 0 

1011 = Higher or Same: L = 1 or Z = 1 

1100 = Less Than: N = 0 and Z = 0 

1101 = Greater or Equal: N = 1 or Z = 1 

1110 = (Unconditionally True) 

1111 = (Unconditionally False) 
short = Short Immediate value. May contain 

quick: Signed 4-bit value, in MOVQ, ADDQ, CMPQ, 
ACB. 

cond: Condition Code (above), in Scond. 
areg: CPU Dedicated Register, in LPR, SPR. 

0000 = UPSR 

0001 - 0111 = (Reserved) 

1000 = FP 

1001 = SP 

1010 = SB 

1011 = (Reserved) 

1100 = (Reserved) 

1101 = PSR 

1110 = INTBASE 

1111 = MOD 

Options: in String Instructions 


T = Translated 
B = Backward 
U/W = 00: None 

01: While Match 
11: Until Match 


Configuration bits in SETCFG instruction: 


cond 1010 


op 0010


Format 1


BSR       -0000    ENTER     -1000
RET       -0001    EXIT      -1001
CXP       -0010    NOP       -1010
RXP       -0011    WAIT      -1011
RETT      -0100    DIA       -1100
RETI      -0101    FLAG      -1101
SAVE      -0110    SVC       -1110
RESTORE   -0111    BPT       -1111


Bit layout (15..0): gen | short | op | 1 1 | i


Format 2


ADDQ      -000     ACB       -100
CMPQ      -001     MOVQ      -101
SPR       -010     LPR       -110
Scond     -011


Bit layout (15..0): gen | op | 1 1 1 1 1 | i


Format 3 


CXPD      -0000    ADJSP     -1010
BICPSR    -0010    JSR       -1100
JUMP      -0100    CASE      -1110
BISPSR    -0110

Trap (UND) on XXX1, 1000




Bit layout (15..0): gen1 | gen2 | op | i


Format 4 


ADD       -0000    SUB       -1000
CMP       -0001    ADDR      -1001
BIC       -0010    AND       -1010
ADDC      -0100    SUBC      -1100
MOV       -0101    TBIT      -1101
OR        -0110    XOR       -1110

Format 5 


MOVS      -0000    BITWT     -1000
CMPS      -0001    TBITS     -1001
SETCFG    -0010    BBAND     -1010
SKPS      -0011    SBITPS    -1011
BBSTOD    -0100    BBFOR     -1100
EXTBLT    -0101    SBITS     -1101
BBOR      -0110    BBXOR     -1110
MOVMP     -0111

No Operation on 1111



Bit layout (23..0): gen1 | gen2 | op | i | 0 1 0 0 1 1 1 0


Format 6 


ROT          -0000    NEG          -1000
ASH          -0001    NOT          -1001
CBIT         -0010    Trap (UND)   -1010
CBITI        -0011    SUBP         -1011
Trap (UND)   -0100    ABS          -1100
LSH          -0101    COM          -1101
SBIT         -0110    IBIT         -1110
SBITI        -0111    ADDP         -1111


Format 9


MOVif     -000     ROUND     -100
LFSR      -001     TRUNC     -101
MOVLF     -010     SFSR      -110
MOVFL     -011     FLOOR     -111


Format 10


Trap (UND)   Always


Bit layout (23..0): gen1 | gen2 | op | 0 | f | 1 0 1 1 1 1 1 0


Format 11


ADDf         -0000    DIVf         -1000
MOVf         -0001    (Note 1)     -1001
CMPf         -0010    Trap (UND)   -1010
(Note 3)     -0011    Trap (UND)   -1011
SUBf         -0100    MULf         -1100
NEGf         -0101    ABSf         -1101
Trap (UND)   -0110    Trap (UND)   -1110
Trap (UND)   -0111    Trap (UND)   -1111


Format 7


MOVM         -0000    MUL          -1000
CMPM         -0001    MEI          -1001
INSS         -0010    Trap (UND)   -1010
EXTS         -0011    DEI          -1011
MOVXBW       -0100    QUO          -1100
MOVZBW       -0101    REM          -1101
MOVZiD       -0110    MOD          -1110
MOVXiD       -0111    DIV          -1111


Bit layout (23..0): gen1 | gen2 | op | 0 | f | 1 1 1 1 1 1 1 0


Format 12


(Note 2)     -0000    (Note 2)     -1000
(Note 1)     -0001    (Note 1)     -1001
POLYf        -0010    Trap (UND)   -1010
DOTf         -0011    Trap (UND)   -1011
SCALBf       -0100    (Note 2)     -1100
LOGBf        -0101    (Note 1)     -1101
Trap (UND)   -0110    Trap (UND)   -1110
Trap (UND)   -0111    Trap (UND)   -1111


•Instructions with Format 12 are available only when the NS32381 is used. 


gen 1 gen 2 reg 


10 1110 


-0 00 INDEX 

-0 01 FFS 

-0 10 
-011 


Trap (UND) 


Format 13 

Always 


Trap (UND) on -110 and -111


7 0 

'I I I I I I I I I 
10 0 11110 

TL/EE/9424-54 


1 0 0 0 1 1 1 1 0| 

TL/EE/9424-55 
Note 1: Opcode not defined; CPU treats like MOVf. First operand has access class of read; second operand has access class of write; f-field selects 32-bit or
64-bit data. 

Note 2: Opcode not defined; CPU treats like ADDf. First operand has access class of read; second operand has access class of read-modify-write. f-field selects 
32-bit or 64-bit data. 

Note 3: Opcode not defined; CPU treats like CMPf. First operand has access class of read; second operand has access class of read, f-field selects 32-bit or 64-bit 
data. 
General Description

The NS32381 is a second generation, CMOS, floating-point 
slave processor that is fully software compatible with its 
forerunner, the NS32081 FPU. The NS32381 FPU functions 
with National’s Embedded System Processors™, the 
NS32GX32 and the NS32CG16, and with any Series 32000 
CPU, from the NS32008 to the NS32532, in a tightly cou- 
pled slave configuration. The performance of the NS32381 
has been increased over the NS32081 by architecture im- 
provements, hardware enhancements, and higher clock fre- 
quencies. Key improvements include the addition of a 32-bit 
slave protocol, an early done algorithm to increase CPU/ 
FPU parallelism, an expanded register set, an automatic 
power down feature, expanded math hardware, and addi- 
tional instructions. 

The NS32381 FPU contains eight 64-bit data registers and 
a Floating-Point Status Register (FSR). The FPU executes 
20 instructions, and operates on both single and double- 
precision operands. Three separate processors in the 
NS32381 manipulate the mantissa, sign, and exponent. 
The CPU and NS32381 FPU form a tightly coupled comput- 
er cluster, which appears to the user as a single processing 
unit. The CPU and FPU communication is handled automati- 
cally, and is user transparent. 


The FPU is fabricated with National’s advanced double-met- 
al CMOS process. It is available in a 68-pin Pin Grid Array 
(PGA) package or 68-pin Plastic package. 

Features 

■ Compatible with NS32008, NS32016, NS32C016, 

NS32032, NS32C032, NS32332, NS32532, NS32CG16 
and NS32GX32 microprocessors 

■ Selectable 16-bit or 32-bit Slave Protocol 

■ Format compatible with IEEE Standard 754-1985 for 
binary floating point arithmetic 

■ Early done algorithm 

■ Single (32-bit) and double (64-bit) precision operations 
■ Eight on-chip (64-bit) data registers

■ Automatic power down mode 

■ Full upward compatibility with existing 32000 software 

■ High speed double-metal CMOS design 
■ 68-pin PGA package

■ 68-pin plastic package 


FPU Block Diagram

(Blocks shown: Control Unit, Execution Unit, Interface and Storage Unit)

FIGURE 1-1

1.0 Product Introduction

The NS32381 Floating-Point Unit (FPU) provides high 
speed floating-point operations for the Series 32000 family, 
and is fabricated using National high-speed CMOS technol- 
ogy. It operates as a slave processor for transparent expan- 
sion of the Series 32000 CPU’s basic instruction set. The 
FPU can also be used with other microprocessors as a pe- 
ripheral device by using additional TTL and CMOS interface 
logic. The NS32381 is compatible with the IEEE Floating- 
Point Formats. 

1.1 IEEE FEATURES SUPPORTED-STANDARD 754-1985 

a) Basic floating-point number formats 

b) Add, subtract, multiply, divide and compare operations 

c) Conversions between different floating-point formats 

d) Conversions between floating-point and integer formats 

e) Round floating-point number to integer (round to near- 
est, round toward negative infinity and round toward 
zero, in double or single-precision) 

f) Exception signaling and handling (invalid operation, di- 
vide by zero, overflow, underflow and inexact) 

1.2 OPERAND FORMATS 

The NS32381 FPU operates on two floating-point data
types: single precision (32 bits) and double precision (64
bits). Floating-point instruction mnemonics use the suffix F 
(Floating) to select the single precision data type, and the 
suffix L (Long Floating) to select the double precision data 
type. 

A floating-point number is divided into three fields, as shown 
in Figure 1-2. 

The F field is the fractional portion of the represented num- 
ber. In Normalized numbers (Section 1.2.1), the binary point 
is assumed to be immediately to the left of the most signifi- 
cant bit of the F field, with an implied 1 bit to the left of the 
binary point. Thus, the F field represents values in the range
1.0 ≤ x < 2.0.


TABLE 1-1. Sample F Fields 


| F Field | Binary Value | Decimal Value |
|---|---|---|
| 000...0 | 1.000...0 | 1.000...0 |
| 010...0 | 1.010...0 | 1.250...0 |
| 100...0 | 1.100...0 | 1.500...0 |
| 110...0 | 1.110...0 | 1.750...0 |

The leading 1 of each binary value is the Implied Bit.

The E field contains an unsigned number that gives the bi- 
nary exponent of the represented number. The value in the 
E field is biased; that is, a constant bias value must be sub- 
tracted from the E field value in order to obtain the true 


exponent. The bias value is 011...11 (binary), which is either 127
(single precision) or 1023 (double precision). Thus, the true
exponent can be either positive or negative, as shown in 
Table 1-2. 


TABLE 1-2. Sample E Fields 


| E Field | F Field | Represented Value |
|---|---|---|
| 011...110 | 100...0 | 1.5 × 2^-1 = 0.75 |
| 011...111 | 100...0 | 1.5 × 2^0 = 1.50 |
| 100...000 | 100...0 | 1.5 × 2^1 = 3.00 |


Two values of the E field are not exponents. 11...11 signals
a reserved operand (Section 1.2.3). 00...00 represents
the number zero if the F field is also all zeroes; otherwise
it signals a reserved operand.

The S bit indicates the sign of the operand. It is 0 for posi- 
tive and 1 for negative. Floating-point numbers are in sign- 
magnitude form, that is, only the S bit is complemented in 
order to change the sign of the represented number. 

1.2.1 Normalized Numbers 

Normalized numbers are numbers which can be expressed 
as floating-point operands, as described above, where the E 
field is neither all zeroes nor all ones. 

The value of a Normalized number can be derived by the 
formula: 

(-1)^S × 2^(E - Bias) × (1 + F)

The range of Normalized numbers is given in Table 1-3. 
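
As an illustration of the formula above, the following sketch splits a single-precision bit pattern into its S, E and F fields and evaluates the value of a Normalized number. The field positions are those of Figure 1-2 (S in bit 31, E in bits 30-23, F in bits 22-0) and the bias is 127; the function name is an illustrative assumption, and the F field is scaled by 2^23 so that it represents the fraction to the right of the binary point.

```python
def decode_single(bits):
    """Split a 32-bit pattern into S, E, F and evaluate a Normalized number.

    Returns (s, e, f, value). Raises ValueError when E is all zeroes or all
    ones, because such patterns are not Normalized numbers.
    """
    s = (bits >> 31) & 0x1
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if e == 0 or e == 0xFF:
        raise ValueError("E field is all zeroes or all ones: not Normalized")
    # (-1)^S x 2^(E - Bias) x (1 + F), with F taken as a 23-bit fraction.
    value = (-1) ** s * 2.0 ** (e - 127) * (1 + f / 2 ** 23)
    return s, e, f, value


if __name__ == "__main__":
    # The three rows of Table 1-2 in single precision.
    for pattern in (0x3F400000, 0x3FC00000, 0x40400000):
        print(hex(pattern), decode_single(pattern)[3])
```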

1.2.2 Zero 

There are two representations for zero: positive and negative.
Positive zero has all-zero F and E fields, and the S bit is
zero. Negative zero also has all-zero F and E fields, but its S 
bit is one. 

1.2.3 Reserved Operands 

The IEEE Standard for Binary Floating-Point Arithmetic pro- 
vides for certain exceptional forms of floating-point oper- 
ands. The NS32381 FPU treats these forms as reserved 
operands. The reserved operands are: 

• Positive and negative infinity 

• Not-a-Number (NaN) values 

• Denormalized numbers 

Both Infinity and NaN values have all ones in their E fields. 
Denormalized numbers have all zeroes in their E fields and 
non-zero values in their F fields. 

The NS32381 FPU causes an Invalid Operation trap (Section
2.1.2.2) if it receives a reserved operand, unless the
operation is simply a move (without conversion). The FPU 
does not generate reserved operands as results. 
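
The operand classes of Sections 1.2.1 through 1.2.3 can be told apart from the E and F fields alone. The sketch below does this for single-precision patterns. The text above states only that Infinity and NaN both have an all-ones E field; the distinction between them by F field (zero for Infinity, non-zero for NaN) is taken from IEEE Standard 754-1985, and the function names are illustrative assumptions.

```python
def classify_single(bits):
    """Classify a 32-bit single-precision pattern by its E and F fields."""
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if e == 0xFF:
        # IEEE 754: all-ones E with zero F is Infinity, otherwise NaN.
        return "infinity" if f == 0 else "NaN"
    if e == 0:
        # All-zero E: zero if F is also zero, otherwise denormalized.
        return "zero" if f == 0 else "denormalized"
    return "normalized"


def is_reserved(bits):
    """True if the FPU would treat the pattern as a reserved operand."""
    return classify_single(bits) in ("infinity", "NaN", "denormalized")


if __name__ == "__main__":
    for pattern in (0x3F800000, 0x80000000, 0x7F800000, 0x7FC00000, 0x00000001):
        print(hex(pattern), classify_single(pattern), is_reserved(pattern))
```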


Single Precision

Bits:    31    30-23    22-0
Field:   S     E        F
Width:   1     8        23


Double Precision

Bits:    63    62-52    51-0
Field:   S     E        F
Width:   1     11       52

FIGURE 1-2. Floating-Point Operand Formats 
TABLE 1-3. Normalized Number Ranges

| | Single Precision | Double Precision |
|---|---|---|
| Most Positive | 2^127 × (2 - 2^-23) = 3.40282346 × 10^38 | 2^1023 × (2 - 2^-52) = 1.7976931348623157 × 10^308 |
| Least Positive | 2^-126 = 1.17549436 × 10^-38 | 2^-1022 = 2.2250738585072014 × 10^-308 |
| Least Negative | -(2^-126) = -1.17549436 × 10^-38 | -(2^-1022) = -2.2250738585072014 × 10^-308 |
| Most Negative | -2^127 × (2 - 2^-23) = -3.40282346 × 10^38 | -2^1023 × (2 - 2^-52) = -1.7976931348623157 × 10^308 |


Note: The values given are extended one full digit beyond their represented accuracy to help in generating rounding and conversion algorithms. 
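
The entries of Table 1-3 follow from the field widths alone, which can be checked with exact rational arithmetic. This is a minimal sketch assuming the widths from Figure 1-2 (8 and 23 bits for single precision, 11 and 52 bits for double precision); the function name is illustrative.

```python
from fractions import Fraction


def normalized_range(exp_bits, frac_bits):
    """Return (most_positive, least_positive) Normalized values as exact Fractions."""
    bias = 2 ** (exp_bits - 1) - 1                    # 127 or 1023
    # Largest exponent field that is not all ones gives a true exponent of +bias;
    # the largest significand is 2 - 2^-frac_bits.
    most = Fraction(2) ** bias * (2 - Fraction(1, 2 ** frac_bits))
    # Smallest exponent field that is not all zeroes gives a true exponent of
    # 1 - bias; the smallest significand is 1.
    least = Fraction(1, 2 ** (bias - 1))
    return most, least


if __name__ == "__main__":
    for name, widths in (("single", (8, 23)), ("double", (11, 52))):
        most, least = normalized_range(*widths)
        print(name, float(most), float(least))
```

The negative limits in the table are the same magnitudes with the S bit set.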


1.2.4 Integers 

In addition to performing floating-point arithmetic, the 
NS32381 FPU performs conversions between integer and 
floating-point data types. Integers are accepted or generat- 
ed by the FPU as two’s complement values of byte (8 bits), 
word (16 bits) or double word (32 bits) length. 

See Figure 1-3 for the Integer Format and Table 1-4 for the 
Integer Fields. 

Bits:    n-1    n-2 to 0
Field:   S      I

FIGURE 1-3. Integer Format 


TABLE 1-4. Integer Fields

| S | Value | Name |
|---|---|---|
| 0 | I | Positive Integer |
| 1 | I - 2^n | Negative Integer |

Note: n represents the number of bits in the word: 8 for byte, 16 for word,
and 32 for double-word.
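
Table 1-4 can be applied mechanically to any of the three integer lengths. The sketch below assumes that I in the table denotes the unsigned value of the whole n-bit pattern, which is the reading under which "I - 2^n" yields the usual two's complement value; the function name is illustrative.

```python
def integer_value(bits, n):
    """Interpret an n-bit pattern (n = 8, 16 or 32) as a two's complement integer."""
    if n not in (8, 16, 32):
        raise ValueError("n must be 8 (byte), 16 (word) or 32 (double word)")
    bits &= (1 << n) - 1          # keep only the low n bits
    s = bits >> (n - 1)           # S is the most significant bit
    # S = 0: the value is the pattern itself; S = 1: subtract 2^n.
    return bits - (1 << n) if s else bits


if __name__ == "__main__":
    print(integer_value(0xFF, 8), integer_value(0x8000, 16), integer_value(0x7F, 8))
```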


1.2.5 Memory Representations 

The NS32381 FPU does not directly access memory. How- 
ever, it is cooperatively involved in the execution of a set of 
two-address instructions with its Series 32000 Family CPU. 
The CPU determines the representation of operands in 
memory. 

In the Series 32000 family of CPUs, operands are stored in 
memory with the least significant byte at the lowest byte 


address. The only exception to this rule is the Immediate 
addressing mode, where the operand is held (within the in- 
struction format) with the most significant byte at the lowest 
address. 
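
The two byte orders described above can be illustrated with Python's standard struct module, which packs IEEE 754 values in either order. This is only a sketch of the resulting byte sequences; the function names and the `double` flag are illustrative assumptions, and nothing here models the actual CPU/FPU operand transfer.

```python
import struct


def memory_bytes(value, double=True):
    """Bytes of a floating-point operand in memory: least significant byte first."""
    return struct.pack("<d" if double else "<f", value)


def immediate_bytes(value, double=True):
    """Bytes of an Immediate-mode operand: most significant byte first."""
    return struct.pack(">d" if double else ">f", value)


if __name__ == "__main__":
    print(memory_bytes(1.5).hex())     # lowest address on the left
    print(immediate_bytes(1.5).hex())
```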

2.0 Architectural Description 

2.1 PROGRAMMING MODEL 

The Series 32000 architecture includes nine registers that 
are implemented on the NS32381 Floating-Point Unit (FPU). 

2.1.1 Floating-Point Registers 

There are eight registers (L0-L7) on the NS32381 FPU for 
providing high-speed access to floating-point operands. 
Each is 64 bits long. A floating-point register is referenced 
whenever a floating-point instruction uses the Register ad- 
dressing mode (Section 2.2.2) for a floating-point operand. 
All other Register mode usages (i.e., integer operands) refer 
to the General Purpose Registers (R0-R7) of the CPU, and 
the FPU transfers the operand as if it were in memory. 

Note: These registers are all upward compatible with the 32-bit NS32081 
registers, (F0-F7), such that when the Register addressing mode is 
specified for a double precision (64-bit) operand, a pair of 32-bit reg- 
isters holds the operand. The programmer specifies the even register 
of the pair which contains the least significant half of the operand and 
the next consecutive register contains the most significant half. 

2.1.2 Floating-Point Status Register (FSR) 

The Floating-Point Status Register (FSR) selects operating 
modes and records any exceptional conditions encountered 
during execution of a floating-point operation. Figure 2-2 
shows the format of the FSR. 



FSR: 32 bits

Floating-point registers: eight, each 64 bits wide (two 32-bit halves)

| Register | MSDW (32 bits) | LSDW (32 bits) |
|---|---|---|
| L0 | F1 | F0 |
| L1 | | |
| L2 | F3 | F2 |
| L3 | | |
| L4 | F5 | F4 |
| L5 | | |
| L6 | F7 | F6 |
| L7 | | |

LSDW = least significant double word
MSDW = most significant double word


FIGURE 2-1. Register Set 
Bits:    31-17      16     15-9    8-7    6     5      4     3      2-0
Field:   Reserved   RMB    SWF     RM     IF    IEN    UF    UEN    TT


FIGURE 2-2. The Floating-Point Status Register 
2.1.2.1 FSR Mode Control Fields

The FSR mode control fields select FPU operation modes. 
The meanings of the FSR mode control bits are given be- 
low. 

Rounding Mode (RM): Bits 7 and 8. This field selects the 
rounding method. Floating-point results are rounded when- 
ever they cannot be exactly represented. The rounding 
modes are: 

00 Round to nearest value. The value which is nearest to 
the exact result is returned. If the result is exactly half- 
way between the two nearest values the even value 
(LSB = 0) is returned. 

01 Round toward zero. The nearest value which is closer 
to zero or equal to the exact result is returned. 

10 Round toward positive infinity. The nearest value which 
is greater than or equal to the exact result is returned. 

11 Round toward negative infinity. The nearest value 
which is less than or equal to the exact result is re- 
turned. 
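
The four RM encodings can be illustrated on a simpler problem than floating-point rounding: rounding an exact rational value to an integer. The sketch below applies the same four rules to that case. It is not a model of the FPU's internal rounding to a 24-bit or 53-bit significand, and the function name is an illustrative assumption.

```python
import math
from fractions import Fraction


def round_by_rm(exact, rm):
    """Round an exact value to an integer using the rule selected by the RM field."""
    exact = Fraction(exact)
    if rm == 0b00:
        # Round to nearest; Python rounds a Fraction half-way case to the even value.
        return round(exact)
    if rm == 0b01:
        return math.trunc(exact)   # toward zero
    if rm == 0b10:
        return math.ceil(exact)    # toward positive infinity
    if rm == 0b11:
        return math.floor(exact)   # toward negative infinity
    raise ValueError("RM is a 2-bit field")


if __name__ == "__main__":
    for rm in range(4):
        print(format(rm, "02b"), round_by_rm(Fraction(-5, 2), rm))
```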

Underflow Trap Enable (UEN): Bit 3. If this bit is set, the 
FPU requests a trap whenever a result is too small in abso- 
lute value to be represented as a normalized number. If it is 
not set, any underflow condition returns a result of exactly 
zero. 

Inexact Result Trap Enable (IEN): Bit 5. If this bit is set, 
the FPU requests a trap whenever the result of an operation 
cannot be represented exactly in the operand format of the 
destination. If it is not set, the result is rounded according to 
the selected rounding mode. 

2.1.2.2 FSR Status Fields 

The FSR Status Fields record exceptional conditions en- 
countered during floating-point data processing. The mean- 
ings of the FSR status bits are given below: 

Trap Type (TT): bits 0-2. This 3-bit field records any excep- 
tional condition detected by a floating-point instruction. The 
TT field is loaded with zero whenever any floating-point in- 
struction except LFSR or SFSR completes without encoun- 
tering an exceptional condition. It is also set to zero by a 
hardware reset or by writing zero into it with the Load FSR 
(LFSR) instruction. Underflow and Inexact Result are always 
reported in the TT field, regardless of the settings of the 
UEN and IEN bits. 

000 No exceptional condition occurred. 

001 Underflow. A non-zero floating-point result is too small 
in magnitude to be represented as a normalized float- 
ing-point number in the format of the destination oper- 
and. This condition is always reported in the TT field 
and UF bit, but causes a trap only if the UEN bit is set. 
If the UEN bit is not set, a result of Positive Zero is 
produced, and no trap occurs. 


010 Overflow. A result (either floating-point or integer) of a 
floating-point instruction is too great in magnitude to 
be held in the format of the destination operand. Note 
that rounding, as well as calculations, can cause this 
condition. 

011 Divide by zero. An attempt has been made to divide a
non-zero floating-point number by zero. Dividing zero
by zero is considered an Invalid Operation instead
(below).

100 Illegal Instruction. Any instruction forms not included 
in the NS32381 Instruction Set are detected by the 
FPU as being illegal. 

101 Invalid Operation. One of the floating-point operands 
of a floating-point instruction is a Reserved operand, 
or an attempt has been made to divide zero by zero 
using the DIVf instruction. 

110 Inexact Result. The result (either floating-point or inte-
ger) of a floating-point instruction cannot be repre- 
sented exactly in the format of the destination oper- 
and, and a rounding step must alter it to fit. This condi- 
tion is always reported in the TT field and IF bit unless 
any other exceptional condition has occurred in the 
same instruction. In this case, the TT field always con- 
tains the code for the other exception and the IF bit is 
not altered. A trap is caused by this condition only if 
the IEN bit is set; otherwise the result is rounded and 
delivered, and no trap occurs. 

111 (Reserved for future use.)

Underflow Flag (UF): Bit 4. This bit is set by the FPU when- 
ever a result is too small in absolute value to be represented 
as a normalized number. Its function is not affected by the 
state of the UEN bit. The UF bit is cleared only by writing a 
zero into it with the Load FSR instruction or by a hardware 
reset. 

Inexact Result Flag (IF): Bit 6. This bit is set by the FPU 
whenever the result of an operation must be rounded to fit 
within the destination format. The IF bit is set only if no other 
error has occurred. It is cleared only by writing a zero into it 
with the Load FSR instruction or by a hardware reset.

Register Modify Bit (RMB): Bit 16. This bit is set by the
FPU whenever writing to a floating point data register. The 
RMB bit is cleared only by writing a zero with the LFSR 
instruction or by a hardware reset. This bit can be used in 
context switching to determine whether the FPU registers 
should be saved. 

2.1.2.3 FSR Software Field (SWF)

Bits 9-15 of the FSR hold and display any information writ- 
ten to them (using the LFSR and SFSR instructions), but are 
not otherwise used by FPU hardware. They are reserved for 
use with NSC floating-point extension software. 
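
The FSR fields described in Sections 2.1.2.1 through 2.1.2.3 can be unpacked from a 32-bit value as follows. The bit positions are those given in the text (TT bits 0-2, UEN bit 3, UF bit 4, IEN bit 5, IF bit 6, RM bits 7-8, SWF bits 9-15, RMB bit 16); the function and dictionary names are illustrative assumptions, and the value is assumed to have been obtained with the SFSR instruction.

```python
TT_NAMES = {
    0b000: "No exceptional condition",
    0b001: "Underflow",
    0b010: "Overflow",
    0b011: "Divide by zero",
    0b100: "Illegal Instruction",
    0b101: "Invalid Operation",
    0b110: "Inexact Result",
    0b111: "Reserved",
}


def decode_fsr(fsr):
    """Split a 32-bit FSR value into its named fields."""
    return {
        "TT": fsr & 0x7,            # bits 0-2: Trap Type
        "UEN": (fsr >> 3) & 0x1,    # bit 3: Underflow Trap Enable
        "UF": (fsr >> 4) & 0x1,     # bit 4: Underflow Flag
        "IEN": (fsr >> 5) & 0x1,    # bit 5: Inexact Result Trap Enable
        "IF": (fsr >> 6) & 0x1,     # bit 6: Inexact Result Flag
        "RM": (fsr >> 7) & 0x3,     # bits 7-8: Rounding Mode
        "SWF": (fsr >> 9) & 0x7F,   # bits 9-15: Software Field
        "RMB": (fsr >> 16) & 0x1,   # bit 16: Register Modify Bit
    }


if __name__ == "__main__":
    fields = decode_fsr(0x10051)
    print(fields, TT_NAMES[fields["TT"]])
```

In a context switch, for example, the RMB field of the decoded value indicates whether the floating-point data registers were written since the bit was last cleared.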

2.2 INSTRUCTION SET 

2.2.1 Floating-Point Instruction Set 

This section describes the floating-point instructions execut- 
ed by the FPU in conjunction with the CPU. These instruc- 
tions form a subset of the Series 32000® instruction set and 
use encoding formats 9, 11, and 12. A list of all the Series
32000 instructions as well as details on their formats and 
addressing modes can be found in the appropriate CPU 
data sheets. 

Certain notations in the following instruction description ta- 
bles serve to relate the assembly language form of each 
instruction to its binary format in Figure 2-3. 
FIGURE 2-3. Floating-Point Instruction Formats

The Format column indicates which of the three formats in 
Figure 2-3 represents each instruction. 

The Op column indicates the binary pattern for the field 
called “op” in the applicable format. 

The Instruction column gives the form of each instruction as
it appears in assembly language. The form consists of an
instruction mnemonic in upper case, with one or more
suffixes (i or f) indicating data types, followed by a list of
operands (gen1, gen2).

An i suffix on an instruction mnemonic indicates a choice of 
integer data types. This choice affects the binary pattern in 
the i field of the corresponding instruction format as follows: 


Suffix i   Data Type     i Field

B          Byte          00

W          Word          01

D          Double Word   11


An f suffix on an instruction mnemonic indicates a choice of 
floating-point data types. This choice affects the setting of 
the f bit of the corresponding instruction format as follows: 

Suffix f Data Type f Bit 

F Single Precision 1 

L Double Precision (Long) 0 
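
The two suffix tables can be captured as simple lookups. The sketch below is illustrative only; the dictionary and function names are assumptions, while the bit patterns are the ones listed above.

```python
# Encodings taken from the suffix tables; names are illustrative.
I_FIELD = {"B": 0b00, "W": 0b01, "D": 0b11}   # integer suffix -> i field
F_BIT = {"F": 1, "L": 0}                      # floating suffix -> f bit


def i_field(suffix: str) -> int:
    """Two-bit i field for an integer data type suffix (B, W or D)."""
    return I_FIELD[suffix.upper()]


def f_bit(suffix: str) -> int:
    """f bit for a floating-point data type suffix (F or L)."""
    return F_BIT[suffix.upper()]
```

For example, a MOVif instruction written with the suffixes W and L would use `i_field("W")` and `f_bit("L")`, giving an i field of 01 and an f bit of 0.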


An operand designation (gen1, gen2) indicates a choice of
addressing mode expressions. This choice affects the binary
pattern in the corresponding gen1 or gen2 field of the
instruction format. Refer to Table 2-1 for the options
available and their patterns.

Further details of the exact operations performed by each 
instruction are found in the Series 32000 Instruction Set 
Reference Manual. 

Movement and Conversion 

The following instructions move the gen1 operand to the
gen2 operand, leaving the gen1 operand intact.

Format  Op    Instruction           Description

11      0001  MOVf     gen1, gen2   Move without conversion.

9       010   MOVLF    gen1, gen2   Move, converting from double
                                    precision to single precision.

9       011   MOVFL    gen1, gen2   Move, converting from single
                                    precision to double precision.

9       000   MOVif    gen1, gen2   Move, converting from any integer
                                    type to any floating-point type.

9       100   ROUNDfi  gen1, gen2   Move, converting from floating-point
                                    to the nearest integer.

9       101   TRUNCfi  gen1, gen2   Move, converting from floating-point
                                    to the nearest integer closer to zero.

9       111   FLOORfi  gen1, gen2   Move, converting from floating-point
                                    to the largest integer less than or
                                    equal to its value.


Note: The MOVLF instruction f bit must be 1 and the i field must be 10. 
The MOVFL instruction f bit must be 0 and the i field must be 11. 
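
The difference between the three float-to-integer conversions is easiest to see on negative values. The sketch below models them with Python built-ins; it is not the FPU's implementation. In particular, the table above does not state how ROUNDfi resolves an exact tie, so the use of Python's round-half-to-even `round` is an assumption.

```python
import math


def roundfi(x: float) -> int:
    """Nearest integer (ties resolved to even here; an assumption)."""
    return round(x)


def truncfi(x: float) -> int:
    """Nearest integer closer to zero."""
    return math.trunc(x)


def floorfi(x: float) -> int:
    """Largest integer less than or equal to x."""
    return math.floor(x)
```

For x = -2.7, the three functions return -3, -2 and -3 respectively; for x = 2.7 they return 3, 2 and 2.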


Arithmetic Operations 

The following instructions perform floating-point arithmetic
operations on the gen1 and gen2 operands, leaving the
result in the gen2 operand.

Note: POLY and DOT use an additional, third implied operand.

POLY and DOT place their result in the L0/F0 register and not in gen2.


Format   Op    Instruction          Description

11       0000  ADDf    gen1, gen2   Add gen1 to gen2.

11       0100  SUBf    gen1, gen2   Subtract gen1 from gen2.

11       1100  MULf    gen1, gen2   Multiply gen2 by gen1.

11       1000  DIVf    gen1, gen2   Divide gen2 by gen1.

11       0101  NEGf    gen1, gen2   Move negative of gen1 to gen2.

11       1101  ABSf    gen1, gen2   Move absolute value of gen1 to gen2.

(N) 12   0100  SCALBf  gen1, gen2   Move gen2 * 2^gen1 to gen2, for
                                    integral values of gen1, without
                                    computing 2^gen1.

(N) 12   0101  LOGBf   gen1, gen2   Move the unbiased exponent of gen1
                                    to gen2.

(N) 12   0011  DOTf    gen1, gen2   Move (gen1 * gen2) + L0 to L0. (*)

(N) 12   0010  POLYf   gen1, gen2   Move (L0 * gen1) + gen2 to L0. (*)

Notes:

(N): Indicates NEW instruction.

(*) The third implied operand used by these instructions can be either F0 or
L0, depending on whether the 'floating' or 'long' data type is specified in the
opcode.
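
The four new instructions can be modeled arithmetically as follows. This is a sketch in Python double precision under stated assumptions: it ignores the FPU's rounding, exception and special-operand behavior, `logb` is valid only for finite nonzero inputs, and the `acc` parameter stands in for the implied L0/F0 operand. All function names are illustrative.

```python
import math


def scalb(gen1: float, gen2: float) -> float:
    """gen2 * 2**gen1 for integral gen1, without forming 2**gen1."""
    return math.ldexp(gen2, int(gen1))


def logb(gen1: float) -> float:
    """Unbiased exponent of a finite, nonzero gen1."""
    _, exponent = math.frexp(gen1)   # gen1 = m * 2**exponent, 0.5 <= |m| < 1
    return float(exponent - 1)       # rescale so that 1 <= |m| < 2


def dot(gen1: float, gen2: float, acc: float) -> float:
    """DOTf: (gen1 * gen2) + acc, where acc models L0/F0."""
    return gen1 * gen2 + acc


def poly(gen1: float, gen2: float, acc: float) -> float:
    """POLYf: (acc * gen1) + gen2, where acc models L0/F0."""
    return acc * gen1 + gen2


def horner(coeffs, x: float) -> float:
    """Evaluate a polynomial (highest power first) with repeated POLY steps."""
    acc = coeffs[0]
    for c in coeffs[1:]:
        acc = poly(x, c, acc)
    return acc
```

The `horner` helper shows why POLYf accumulates into L0/F0: each step multiplies the running value by x and adds the next coefficient, so 2x^2 + 3x + 4 at x = 2 takes two POLY steps after the first coefficient is loaded.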

Comparison 

The Compare instruction compares two floating-point values,
sending the result to the CPU PSR Z and N bits for use
as condition codes. See Figure 3-11. The Z bit is set if the
gen1 and gen2 operands are equal; it is cleared otherwise.
The N bit is set if the gen1 operand is greater than the gen2
operand; it is cleared otherwise. The CPU PSR L bit is
unconditionally cleared. Positive and negative zero are
considered equal.

Format  Op    Instruction        Description

11      0010  CMPf  gen1, gen2   Compare gen1 to gen2.
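
The condition-code rules above reduce to two comparisons. The sketch below models them for ordinary finite operands; the function name is hypothetical, and behavior for operands that would raise an exception is not modeled.

```python
def cmpf(gen1: float, gen2: float) -> dict:
    """PSR bits produced by CMPf for ordinary (finite) operands."""
    return {
        "Z": int(gen1 == gen2),   # set when the operands are equal
        "N": int(gen1 > gen2),    # set when gen1 is greater than gen2
        "L": 0,                   # unconditionally cleared
    }
```

Because Python also treats 0.0 and -0.0 as equal, `cmpf(0.0, -0.0)` sets Z, matching the rule that positive and negative zero compare equal.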


Floating-Point Status Register Access 

The following instructions load and store the FSR as a 32- 
bit integer. 


Format  Op   Instruction   Description

9       001  LFSR  gen1    Load FSR

9       110  SFSR  gen2    Store FSR


Note: All instructions support all of the NS32000 family data formats (for 
external operands) and all addressing modes are supported. 


Rounding 

The FPU supports all IEEE rounding options: round toward
the nearest value (or toward the even significand in case of
a tie), round toward zero, round toward positive infinity and
round toward negative infinity.
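
The four rounding options differ only in how a value that falls between two representable results is resolved. The sketch below illustrates this at integer granularity with Python's `decimal` module; the FPU applies the same rules at the least significant bit of the destination significand. The mode labels and function name are illustrative assumptions.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN)

# Illustrative labels for the four rounding options.
MODES = {
    "nearest": ROUND_HALF_EVEN,   # nearest value, even on a tie
    "zero": ROUND_DOWN,           # toward zero
    "+inf": ROUND_CEILING,        # toward positive infinity
    "-inf": ROUND_FLOOR,          # toward negative infinity
}


def round_to_integer(value: str, mode: str) -> int:
    """Round a decimal string to an integer under the named mode."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=MODES[mode]))
```

With the tie value -2.5, the modes "nearest", "zero" and "+inf" all give -2, while "-inf" gives -3; with 3.5, "nearest" gives 4 because 4 is even.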

2.3 EXCEPTIONS 

The FPU supports five types of exceptions: Invalid Operation,
Division by Zero, Overflow, Underflow and Inexact Result.
When an exception occurs, the FPU may or may not
generate a trap, depending upon the bit settings in the FSR.
The user can disable the Inexact Result and the
Underflow traps. If an undefined floating-point instruction is
passed to the FPU, an Illegal Instruction trap will occur. The
user cannot disable the trap on Illegal Instruction.

Upon detecting an exceptional condition in executing a
floating-point instruction, the FPU requests a TRAP by
pulsing the SPC line for one clock cycle, pulsing the SDN332
line for two and a half clock cycles and pulsing the FSSR
line for one clock cycle. (The user will connect the correct
lines according to the CPU being used.)

In addition, the FPU sets the Q bit in the status word
register. The CPU responds by reading the status word register
(refer to Section 3.6.1 for its format) while applying status
h'E (transferring status word) on the status lines. A trapped
instruction returns no result (even if the destination is an FPU
register) and does not affect the CPU PSR. The FPU
records the cause of the exception in the trap type (TT) field
of the FSR. If an illegal opcode is detected, the FPU sets the
TS bit in the slave processor status word register, indicating
a trap (UND).

3.0 Functional Description 

3.1 POWER AND GROUNDING 

The NS32381 requires a single 5V power supply, applied on 
the Vcc pins. These pins should be connected together by 
a power (Vcc) plane on the printed circuit board. See Figure 
3-1. 

The grounding connections are made on the GND pins. 
These pins should be connected together by a ground 
(GND) plane on the printed circuit board. See Figure 3-1. 




FIGURE 3-1. Recommended Supply Connections (PLCC Package)

FIGURE 3-2. Power-On Reset Requirements

3.2 AUTOMATIC POWER DOWN MODE

The NS32381 supports a power down mode in which the
device consumes only 10% of its original power at 30 MHz.
The NS32381 enters the power down mode (internal clocks
are stopped with phase two high) if it does not receive an
SPC pulse from the CPU within 256 clocks.

The FPU exits the power down mode and returns to normal
operation after it receives an SPC from the CPU. There is no
extra delay caused by the FPU being in the power down
mode.

3.3 CLOCKING 

The NS32381 FPU requires a single-phase TTL clock input 
on its CLK pin (pin A8). Different Clock sources can be used 
to provide the CLK signal depending on the application. For 
example, it can come from the BCLK of the NS32532 CPU. 
It can also come from the CTTL pin of the NS32C201 Tim- 
ing Control Unit, if it is required. 

3.4 RESETTING 

The RST pin serves as a reset for on-chip logic. The FPU
may be reset at any time by pulling the RST pin low for at
least 64 clock cycles. Upon detecting a reset, the FPU
terminates instruction processing, resets its internal logic, and
clears the FSR to all zeroes.

On application of power, RST must be held low for at least
30 µs after Vcc is stable. This ensures that all on-chip
voltages are completely stable before operation. See Figures
3-2 and 3-3.

FIGURE 3-3. General Reset Timing (RST low for at least 64 clock cycles)
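
The 64-clock reset requirement translates into a minimum RST low time that depends on the clock frequency. The sketch below performs that arithmetic; the function and constant names are illustrative, and the 30 µs power-on figure is the one quoted in Section 3.4.

```python
RESET_CYCLES = 64          # minimum RST low time, in clock cycles
POWER_ON_RESET_US = 30     # minimum RST low time after Vcc is stable


def min_reset_pulse_ns(clock_mhz: float, cycles: int = RESET_CYCLES) -> float:
    """Minimum RST low time in nanoseconds for a given CLK frequency."""
    return cycles * 1000.0 / clock_mhz
```

At 20 MHz the clock period is 50 ns, so the general reset pulse must last at least 64 x 50 ns = 3200 ns, which is well below the 30 µs required at power-on.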

3.5 BUS OPERATION 

Instructions and operands are passed to the NS32381 FPU
with slave processor bus cycles. Each bus cycle transfers
either one byte (8 bits), one word (16 bits) or one double
word (32 bits) to or from the FPU. During all bus cycles, the
SPC line is driven by the CPU as an active low data strobe,
and the FPU monitors pins ST0-ST3 to keep track of the
sequence (protocol) established for the instruction being
executed. This is necessary in a virtual memory environment,
allowing the FPU to retry an aborted instruction.

3.5.1 Bus Cycles 

A bus cycle is initiated by the CPU, which asserts the proper
status on ST0-ST3 and pulses SPC low. The status lines
are sampled by the FPU on the leading (falling) edge of the
SPC pulse, except with the 32532 CPU. When used with the
32532 CPU, the status lines are sampled on the rising edge
of CLK in the T2 state. If the transfer is from the FPU (a
slave processor read cycle), the FPU asserts data on the
data bus for the duration of the SPC pulse. If the transfer is
to the FPU (a slave processor write cycle), the FPU latches
data from the data bus on the trailing (rising) edge of the
SPC pulse. Figures 3-5, 3-6, 3-7 and 3-8 illustrate these
sequences.

The direction of the transfer and the role of the bidirectional
SPC line are determined by the instruction protocol being
performed. SPC is always driven by the CPU during slave
processor bus cycles. Protocol sequences for each
instruction are given in Section 3.6.

3.5.2 Operand Transfer Sequences 

An operand is transferred in one or more bus cycles. For the
16-Bit Slave Protocol a 1-byte operand is transferred on the
least significant byte of the data bus (D0-D7). A 2-byte
operand is transferred on the entire bus. A 4-byte or 8-byte
operand is transferred in consecutive bus cycles, least
significant word first.

For the 32-Bit Slave Protocol a 4-byte operand is
transferred on the entire data bus in a single bus cycle and an
8-byte operand is transferred in two consecutive bus cycles
with the most significant byte transferred on data bits (D0-
D7). The complete operand transfer of bytes B0-B7 where
B0 is the least significant byte would appear on the data bus
as B4, B5, B6, B7 followed by B0, B1, B2, B3 in the second
bus cycle.
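
The transfer ordering described above can be sketched by splitting an operand into per-cycle chunks. This is an illustration only: the operand is given as bytes B0, B1, ... with B0 least significant, the function names are hypothetical, and the placement of individual bytes on the data pins within a cycle is not modeled.

```python
def cycles_16bit(operand: bytes) -> list:
    """Bus cycles for the 16-bit protocol, least significant word first."""
    if len(operand) not in (1, 2, 4, 8):
        raise ValueError("operand must be 1, 2, 4 or 8 bytes")
    if len(operand) == 1:
        return [operand]                      # single byte on D0-D7
    return [operand[i:i + 2] for i in range(0, len(operand), 2)]


def cycles_32bit(operand: bytes) -> list:
    """Bus cycles for the 32-bit protocol (4- or 8-byte operands)."""
    if len(operand) == 4:
        return [operand]                      # one double-word cycle
    if len(operand) == 8:
        return [operand[4:8], operand[0:4]]   # B4-B7 first, then B0-B3
    raise ValueError("operand must be 4 or 8 bytes")
```

An 8-byte operand therefore takes four cycles under the 16-bit protocol and two under the 32-bit protocol, with the upper double word sent first in the latter case.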
FIGURE 3-4a. System Connection Diagram with the NS32532 CPU

FIGURE 3-4b. System Connection Diagram with the NS32332 CPU

FIGURE 3-4c. System Connection Diagram with the NS32008, NS32016 or NS32032 CPU

FIGURE 3-4d. System Connection Diagram with the NS32CG16 CPU



Note 1: FPU samples CPU status here. 

Note 2: FPU samples data bus here. 

FIGURE 3-7. Slave Processor Write Cycle (NS32008, NS32016, NS32032 and NS32332 CPU)

Note 1: FPU samples CPU status here.

Note 2: FPU samples data bus here.

FIGURE 3-8. Slave Processor Write Cycle (NS32532 CPU)

3.6 INSTRUCTION PROTOCOLS 
3.6.1 General Protocol Sequences 

The NS32381 supports both the 16-bit and 32-bit General
Slave protocol sequences. See Tables 3-1, 3-2 and Figures
3-12, 3-13 respectively.

Slave Processor instructions have a three-byte Basic
Instruction field, consisting of an ID byte followed by an
Operation Word. See Figure 3-9 for the ID and Opcode format
of the 16-bit Slave Protocol and Figure 3-10 for the ID and
Opcode format of the 32-bit Slave Protocol. The ID Byte has
three functions:

1) It identifies the instruction to the CPU as being a Slave 
Processor instruction. 


2) It specifies which Slave Processor will execute it. 

3) It determines the format of the following Operation Word 
of the instruction. 

Upon receiving a slave processor instruction, the CPU
initiates a sequence outlined in either Table 3-1 or 3-2,
depending on the PS0 and PS1 pins, to allow for the 16-bit or
32-bit slave protocol. The NS32008, NS32016, NS32C016,
NS32032, NS32C032 and NS32CG16 all communicate with
the NS32381 using the 16-bit Slave Protocol. The NS32332,
NS32532 and NS32GX32 CPUs communicate with the
NS32381 using a 32-bit Slave Protocol; a different version is
provided for each CPU.


TABLE 3-1. 16-Bit General Slave Instruction Protocol

Step  Status     Action

1     ID (1111)  CPU sends ID Byte

2     OP (1101)  CPU sends Operation Word

3     OP (1101)  CPU sends required operands (if any)

4     -          Slave starts execution (CPU prefetches)

5     -          Slave pulses SPC low

6     ST (1110)  CPU reads Status Word

7     OP (1101)  CPU reads result (if destination is
                 memory and if no TRAP occurred)

TABLE 3-2. 32-Bit General Slave Instruction Protocol

Step  Status     Action

1     ID (1111)  CPU sends ID and Operation Word

2     OP (1101)  CPU sends required operands (if any)

3     -          Slave starts execution (CPU prefetches)

4     -          Slave signals DONE or TRAP or CMPf

5     ST (1110)  CPU reads Status Word (if TRAP was signaled
                 or a CMPf instruction was executed)

6     OP (1101)  CPU reads result (if destination is memory and
                 if no TRAP occurred)
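
The conditional steps in Tables 3-1 and 3-2 can be made explicit by generating the step list for a given instruction outcome. The sketch below is a paraphrase of the two tables, not a bus-level model; the function names and parameters are illustrative.

```python
def protocol_16bit(dest_in_memory: bool, trap: bool) -> list:
    """Steps of the 16-bit general slave protocol (Table 3-1)."""
    steps = [
        "ID (1111): CPU sends ID Byte",
        "OP (1101): CPU sends Operation Word",
        "OP (1101): CPU sends required operands (if any)",
        "Slave starts execution (CPU prefetches)",
        "Slave pulses SPC low",
        "ST (1110): CPU reads Status Word",
    ]
    if dest_in_memory and not trap:
        steps.append("OP (1101): CPU reads result")
    return steps


def protocol_32bit(dest_in_memory: bool, trap: bool, is_cmpf: bool) -> list:
    """Steps of the 32-bit general slave protocol (Table 3-2)."""
    steps = [
        "ID (1111): CPU sends ID and Operation Word",
        "OP (1101): CPU sends required operands (if any)",
        "Slave starts execution (CPU prefetches)",
        "Slave signals DONE or TRAP or CMPf",
    ]
    if trap or is_cmpf:
        steps.append("ST (1110): CPU reads Status Word")
    if dest_in_memory and not trap:
        steps.append("OP (1101): CPU reads result")
    return steps
```

The comparison shows where the 32-bit protocol saves bus cycles: an instruction that completes without a trap and writes to an FPU register needs four steps, whereas the 16-bit protocol always includes the status read.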


TABLE 3-3. Floating-Point Instruction Protocols

           Operand 1  Operand 2  Operand 1  Operand 2  Returned Value        PSR Bits
Mnemonic   Class      Class      Issued     Issued     Type and Destination  Affected

ADDf       read.f     rmw.f      f          f          f to Op. 2            none
SUBf       read.f     rmw.f      f          f          f to Op. 2            none
MULf       read.f     rmw.f      f          f          f to Op. 2            none
DIVf       read.f     rmw.f      f          f          f to Op. 2            none
MOVf       read.f     write.f    f          N/A        f to Op. 2            none
ABSf       read.f     write.f    f          N/A        f to Op. 2            none
NEGf       read.f     write.f    f          N/A        f to Op. 2            none
CMPf       read.f     read.f     f          f          N/A                   N,Z,L
FLOORfi    read.f     write.i    f          N/A        i to Op. 2            none
TRUNCfi    read.f     write.i    f          N/A        i to Op. 2            none
ROUNDfi    read.f     write.i    f          N/A        i to Op. 2            none
MOVFL      read.F     write.L    F          N/A        L to Op. 2            none
MOVLF      read.L     write.F    L          N/A        F to Op. 2            none
MOVif      read.i     write.f    i          N/A        f to Op. 2            none
LFSR       read.D     N/A        D          N/A        N/A                   none
SFSR       N/A        write.D    N/A        N/A        D to Op. 2            none
SCALBf     read.f     rmw.f      f          f          f to Op. 2            none
LOGBf      read.f     write.f    f          N/A        f to Op. 2            none
DOTf       read.f     read.f     f          f          *f to F0/L0           none
POLYf      read.f     read.f     f          f          *f to F0/L0           none

D = Double Word
i = Integer size (B, W, D) specified in mnemonic.
f = Floating-Point type (F, L) specified in mnemonic.
N/A = Not Applicable to this instruction.

*The "returned value" can go to either F0 or L0 depending on the "f" bit in the opcode, i.e., whether "floating" or "long" data type is used.

FIGURE 3-9. ID and OPCODE Format, 16-Bit Slave Protocol

FIGURE 3-10. ID and OPCODE Format, 32-Bit Slave Protocol


For the 16-bit Slave Protocol the CPU applies Status Code
1111 (Broadcast ID), and sends the ID Byte on the least
significant half of the Data Bus (D0-D7). The CPU next
sends the Operation Word while applying Status Code 1101
(Transfer Slave Operand). The Operation Word is swapped
on the Data Bus; that is, bits 0-7 appear on pins D8-D15,
and bits 8-15 appear on pins D0-D7.

For the 32-bit Slave Protocol the CPU applies Status Code 
1111 and sends the ID Byte (different ID for each format) in 
byte 3 (D24-D31) and the Operation Word in bytes 1 and 2 
in a single double word transfer. The Operation Word is 
swapped such that OPCODE low appears on byte 2 (D16- 
D23) and OPCODE high appears on byte 1 (D8-D15). Byte 
0 (D0-D7) is not used. 
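
The byte placement described for the two protocols can be expressed as bit manipulation on the bus value. The sketch below is illustrative: the function names are hypothetical, and the ID and Operation Word values used in the example are arbitrary placeholders rather than real instruction encodings.

```python
def op_word_on_bus_16bit(op_word: int) -> int:
    """16-bit protocol: Operation Word as seen on D0-D15 (bytes swapped)."""
    low = op_word & 0xFF            # bits 0-7 appear on D8-D15
    high = (op_word >> 8) & 0xFF    # bits 8-15 appear on D0-D7
    return (low << 8) | high


def id_and_op_on_bus_32bit(id_byte: int, op_word: int) -> int:
    """32-bit protocol: ID and Operation Word in one double word on D0-D31."""
    low = op_word & 0xFF            # OPCODE low  -> byte 2 (D16-D23)
    high = (op_word >> 8) & 0xFF    # OPCODE high -> byte 1 (D8-D15)
    return ((id_byte & 0xFF) << 24) | (low << 16) | (high << 8)  # byte 0 unused
```

With the placeholder values ID = 0xBE and Operation Word = 0x1234, the 16-bit protocol presents 0x3412 in its second cycle, while the 32-bit protocol presents 0xBE341200 in a single cycle.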

All Slave Processors input and decode the data from these 
transfers. The Slave Processor selected by the ID Byte is 
activated and from this point on the CPU is communicating 
with it only. If any other slave protocol is in progress (e.g., an 
aborted Slave instruction), this transfer cancels it. Both the 
CPU and FPU are aware of the number and size of the 
operands at this point. 

Using the Addressing Mode fields within the Operation 
Word, the CPU starts fetching operands and issuing them to 
the FPU. To do so, it references any Addressing Mode ex- 
tensions appended to the FPU instruction. Since the CPU is 
solely responsible for memory accesses, these extensions 
are not sent to the Slave Processor. The Status Code ap- 
plied is 1101 (Transfer Slave Processor Operand). 

After the CPU has issued the last operand, the FPU starts
the actual execution of the instruction. A one clock cycle
SPC pulse is used to indicate the completion of the
instruction and for the CPU to continue with the 16-Bit Slave
Protocol by reading the FPU's Status Word Register.

For the 32-bit Slave Protocol, upon completion of the in- 
struction, the FPU will signal the CPU by pulsing either 
SDNXXX or FSSR (Force Slave Status Read). 

A half clock cycle SDN332 pulse with a NS32332 CPU, or a
one clock cycle SDN532 pulse with a NS32532 or
NS32GX32 CPU, indicates a valid completion of the
instruction and that there is no need for the CPU to read its
Status Word Register.

If, however, the CPU does need to read the FPU's Status Word
Register, a two and a half clock cycle SDN332 pulse (with the
NS32332) or a one clock cycle FSSR pulse (with the NS32532
or NS32GX32) will be issued instead.
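
The completion signalling described above depends on the CPU in use and on whether a status read is required. The sketch below collects those cases in one lookup; the function name and the "16-bit" label are illustrative assumptions, and pulse lengths are expressed in clock cycles.

```python
def completion_signal(cpu: str, status_read_needed: bool) -> tuple:
    """Return (line pulsed by the FPU, pulse length in clock cycles)."""
    if cpu == "16-bit":
        # The CPU always reads the status word under the 16-bit protocol.
        return ("SPC", 1.0)
    if cpu == "NS32332":
        return ("SDN332", 2.5 if status_read_needed else 0.5)
    if cpu in ("NS32532", "NS32GX32"):
        return ("FSSR", 1.0) if status_read_needed else ("SDN532", 1.0)
    raise ValueError("unknown CPU: " + cpu)
```

Note that the NS32332 distinguishes the two cases by pulse length on a single line, whereas the NS32532 and NS32GX32 use two separate lines.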

In all cases, for both the 16-Bit and 32-Bit Slave Protocols,
the CPU will use SPC to read the Status Word from the
FPU, while applying status code 1110. This word has the
format shown in Figure 3-11. If the Q bit ("Quit", Bit 0) is set,
this indicates that an error (TRAP) has been detected by the
FPU. The CPU will not continue the protocol, but will
immediately trap through the Slave vector in the Interrupt Table.
If the instruction being performed is CMPf (Section 2.2.3) and
the Q bit is not set, the CPU loads Processor Status Register
(PSR) bits N, Z and L from the corresponding bits in the
FPU Status Word. The FPU always sets the L bit to zero.

The last step will be for the CPU to read the result, provided
there are no errors and the result's destination is in memory.
Here again the CPU uses SPC to read the result from the
FPU and transfer it to its destination. These Read cycles
from the FPU are performed by the CPU while applying
Status Code 1101 (Transfer Slave Operand).


Bits   31-16  15   14-8  7   6   5   4   3   2   1   0
Field  ZERO   TS   ZERO  N   Z   0   0   0   L   0   Q

Bit    Name  Description

(0)    Q:    Set to "1" if an FPU TRAP (error) occurred.
             Cleared to "0" by a valid CMPf.

(2)    L:    Cleared to "0" by the FPU.

(6)    Z:    Set to "1" if the second operand is equal to
             the first operand. Otherwise it is cleared to "0".

(7)    N:    Set to "1" if the second operand is less than
             the first operand. Otherwise it is cleared to "0".

(15)   TS:   Set to "1" if the TRAP is (UND) and cleared to
             "0" if the TRAP is (FPU).

FIGURE 3-11. FPU Status Word Format
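
A CPU-side view of the Status Word can be sketched as a small decoder using the bit positions from Figure 3-11. The function names are illustrative and not part of any NSC software.

```python
def decode_status_word(word: int) -> dict:
    """Extract the defined fields of the FPU Status Word."""
    return {
        "Q": (word >> 0) & 1,     # trap (error) occurred
        "L": (word >> 2) & 1,     # always cleared by the FPU
        "Z": (word >> 6) & 1,     # operands equal (CMPf)
        "N": (word >> 7) & 1,     # second operand less than first (CMPf)
        "TS": (word >> 15) & 1,   # trap type: 1 = UND, 0 = FPU
    }


def trap_kind(word: int):
    """Return "UND", "FPU", or None when no trap is flagged."""
    fields = decode_status_word(word)
    if not fields["Q"]:
        return None
    return "UND" if fields["TS"] else "FPU"
```

A status word of 0x8001 has both Q and TS set and therefore reports an undefined-instruction trap, while 0x0040 reports only that a CMPf found its operands equal.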
FIGURE 3-12. 16-Bit General Slave Instruction Protocol: FPU Actions

FIGURE 3-13. 32-Bit General Slave Instruction Protocol: FPU Actions

3.6.2 Early Done Algorithm 

The NS32381 has the ability to modify the General Slave 
protocol sequences and to boost the performance of the 
FPU by 20% to 40%. This is called the Early Done Algo- 
rithm. 

Early Done applies when the destination of an instruction is
an FPU register and the instruction and range of operands
cannot generate a TRAP (error). When these conditions are
met, the FPU will send an SDNXXX or SPC pulse after
receiving all of the operands from the CPU and before
executing the instruction. Hence this becomes an early done
as compared to the General Slave Protocols.
In the case of the 16-bit Slave Protocol, in which the CPU
always reads the slave status word, the FPU will force all
zeroes to be read. The CPU can then send the next
instruction to the FPU and save the general protocol overhead.
The FPU will start the new instruction immediately after
finishing the previous instruction.

SFSR, CMPF and CMPL do not generate an Early Done. 
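
The Early Done conditions can be summarized as a predicate. This is a sketch of the rule as stated in this section; how the FPU decides that an operand range cannot trap is internal to the device and is represented here only by a caller-supplied flag. The names are illustrative.

```python
# Instructions that never produce an Early Done, per the text above.
NO_EARLY_DONE = {"SFSR", "CMPF", "CMPL"}


def early_done(mnemonic: str, dest_is_fpu_register: bool, can_trap: bool) -> bool:
    """True if the stated Early Done conditions are met."""
    if mnemonic.upper() in NO_EARLY_DONE:
        return False
    return dest_is_fpu_register and not can_trap
```

For example, an ADDF whose destination is an FPU register and whose operands cannot trap qualifies, while the same instruction with a memory destination does not.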

3.6.3 Floating-Point Protocols 

Table 3-3 gives the protocols followed for each floating- 
point instruction. The instructions are referenced by their 
mnemonics. For the bit encodings of each instruction, see 
section 2.2.3. 

The Operand Class columns give the Access Classes for 
each general operand, defining how the addressing modes 
are interpreted by the CPU (see Series 32000 Instruction 
Set Reference Manual). 

The Operand Issued columns show the sizes of the oper- 
ands issued to the Floating-Point Unit by the CPU. “D” indi- 
cates a 32-bit Double Word, “i” indicates that the instruction 
specifies an integer size for the operand (B = Byte, W = 
Word, D = Double Word), “f” indicates that the instruction 
specifies a floating-point size for the operand (F = 32-bit 
Standard Floating, L = 64-bit Long Floating). 

The Returned Value Type and Destination column gives the
size of any returned value and where the CPU places it. The
PSR Bits Affected column indicates which PSR bits, if any,
are updated from the FPU Status Word (Figure 3-11).

Any operand indicated as being of type “f” will not cause a 
transfer if the Register addressing mode is specified, be- 
cause the Floating-Point Registers are physically on the 
Floating-Point Unit and are therefore available without CPU 
assistance. 

4.0 Device Specifications 

4.1 PIN DESCRIPTIONS 

4.1.1 Supplies 

The following is a brief description of all NS32381 pins. 
Vcc Power: + 5V positive supply. 

GND Ground: Ground reference for both on-chip log- 
ic and drivers connected to output pins. 


4.1.2 Input Signals 

CLK Clock: TTL-level clock signal. 

*DDIN Data Direction In: Active low. Status signal indi- 
cating the direction of data transfers during a 
bus cycle. 

ST0-ST3 Status: Bus cycle status code from CPU. ST0 is
the least significant and rightmost bit.

1100 - Reserved

1101 - Transferring Operation Word or Operand

1110 - Reading Status Word

1111 - Broadcasting Slave ID

Note: The NS32332 generates four status lines and the
NS32532 generates five. The user should connect the
status lines as shown below:

NS32381   NS32332   NS32532

ST0       ST0       ST0

ST1       ST1       ST1

ST2       ST2       ST2

ST3       ST3       ST4

RST Reset: Active low. Resets the last operation
and clears the FSR register.

NOE New Opcode Enable: Active high. This signal 
enables the new opcodes available in the 
NS32381. 

PS0, PS1 Protocol Select: Selects the slave protocol to 
be used. PS0 is the least significant and right- 
most bit. 

00 — Selects 16-bit protocol. 

01 — Selects 32-bit protocol for NS32332. 

10— Reserved. 

11 — Selects 32-bit protocol for NS32532. 
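
The protocol selection can be sketched as a lookup from CPU type to the PS1/PS0 levels. The CPU groupings follow Section 3.6.1 and the encodings are those listed above; the names are illustrative. The NS32GX32 is left out because this pin description does not list a setting for it.

```python
# Keys are (PS1, PS0); PS0 is the least significant bit.
PROTOCOL_SELECT = {
    (0, 0): "16-bit protocol",
    (0, 1): "32-bit protocol for NS32332",
    (1, 0): "reserved",
    (1, 1): "32-bit protocol for NS32532",
}

CPUS_16BIT = {"NS32008", "NS32016", "NS32C016",
              "NS32032", "NS32C032", "NS32CG16"}


def ps_pins(cpu: str) -> tuple:
    """Return the (PS1, PS0) levels to strap for the given CPU."""
    if cpu in CPUS_16BIT:
        return (0, 0)
    if cpu == "NS32332":
        return (0, 1)
    if cpu == "NS32532":
        return (1, 1)
    raise ValueError("no PS setting listed for " + cpu)
```

Since PS0 and PS1 are CMOS inputs that must never float, the returned levels correspond to ties to GND (0) or Vcc (1).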

4.1.3 Output Signals 

SDN332 Slave Done 332: Active low. This signal is for 
use with the NS32332 CPU only. If held active 
for a half clock cycle and released this pin indi- 
cates the successful completion of a floating- 
point instruction by the FPU. Holding this pin 
active for two and a half clock cycles indicates 
TRAP or that the CMPf instruction has been ex- 
ecuted. 

SDN532 Slave Done 532: Active low. This signal is for 
use with the NS32532 CPU only. When active it 
indicates successful completion of a floating- 
point instruction by the FPU. 

FSSR Force Slave Status Read: Active low. This sig- 
nal is for use with the NS32532 CPU only. 
When active it indicates TRAP or that the CMPf 
instruction has been executed. 

4.1.4 Input/Output Signals 

*D0-D31 Data Bus: These are the 32 signal lines which
carry data between the NS32381 and the CPU.

SPC Slave Processor Control: Active low. This is the
data strobe signal for slave transfers. For the
32-bit protocol, SPC is only an input signal.

*For the 16-bit Slave Protocol the upper sixteen data input signals (D16-
D31) and DDIN should be left floating.

Connection Diagrams

[Pin grid diagram: NS32381, rows A-L, columns 1-11]

Bottom View

Order Number NS32381
See NS Package Number U68D

FIGURE 4-1. 68-Pin PGA Package
NS32381 Pinout Descriptions

Desc                Pin       Desc                Pin

Vcc                 A2        D28                 F10
D1                  A3        GND                 F11
D0                  A4        GND                 G1
PS1 (Note 1)        A5        D21                 G2
GND                 A6        D12                 G10
GND                 A7        D27                 G11
CLK                 A8        D6                  H1
RST                 A9        D22                 H2
Reserved (Note 2)   A10       D11                 H10
Reserved (Note 2)   B1        SDN332              H11
D2                  B2        D7                  J1
D17                 B3        D23                 J2
D16                 B4        SPC                 J10
PS0 (Note 1)        B5        SDN532              J11
GND                 B6        Vcc                 K1
NOE (Note 1)        B7        D8                  K2
Reserved (Note 3)   B8        GND                 K3
Reserved (Note 2)   B9        D26                 K4
Vcc                 B10       GND                 K5
D15                 B11       Vcc                 K6
D18                 C1        Reserved (Note 3)   K7
D3                  C2        ST0                 K8
D31                 C10       ST1                 K9
D14                 C11       Reserved (Note 3)   K10
D19                 D1        GND                 K11
Vcc                 D2        D24                 L2
D30                 D10       D25                 L3
Vcc                 D11       D9                  L4
D4                  E1        D10                 L5
D20                 E2        DDIN                L6
D13                 E10       Vcc                 L7
D29                 E11       ST2                 L8
Reserved (Note 3)   F1        ST3                 L9
D5                  F2        FSSR                L10

Note 1: CMOS input; never float.
Note 2: Pin should be grounded.
Note 3: Pin should be left floating.

Bottom View

Order Number NS32381V-15, NS32381V-20, NS32381V-25 or NS32381V-30
See NS Package Number V68

FIGURE 4-2. 68-Pin Plastic Chip Carrier Package

Note 1: All these pins should be left open.
Note 2: All these pins should be grounded.

4.2 ABSOLUTE MAXIMUM RATINGS
If Military/Aerospace specified devices are required,
please contact the National Semiconductor Sales
Office/Distributors for availability and specifications.

Maximum Case Temperature          95°C
Storage Temperature               -65°C to +150°C
All Input or Output Voltages
with Respect to GND               -0.5V to +7.0V
ESD Rating                        2000V (in human body model)

Note: Absolute maximum ratings indicate limits beyond
which permanent damage may occur. Continuous operation
at these limits is not intended; operation should be limited to
those conditions specified under Electrical Characteristics.


4.3 ELECTRICAL CHARACTERISTICS TA = 0°C to 70°C, VCC = 5V ± 5%, GND = 0V

Symbol  Parameter

VIH     High Level Input Voltage*
VIL     Low Level Input Voltage*
VOH     High Level Output Voltage
VOL     Low Level Output Voltage
        Input Load Current*
        High Level Input Voltage for PS0, PS1, NOE
        Low Level Input Voltage for PS0, PS1, NOE
        Input Load Current for PS0, PS1, NOE
        Leakage Current (Output and I/O Pins in TRI-STATE®/Input Mode)
        Active Supply Current
        Power Down Current

Conditions (in printed order):
0 ≤ VIN ≤ VCC
0.4 ≤ VOUT ≤ 2.4V
IOUT = 0, TA = 25°C, VCC = 5V
IOUT = 0, TA = 25°C, VCC = 5V

*Except PS0, PS1, NOE and Reserved pins.

Note: The PS0, PS1 and NOE pins have to be connected to either GND or VCC (possibly via a resistor) as shown in Figures 3-4a, 3-4b, 3-4c, and 3-4d.


4.4 SWITCHING CHARACTERISTICS 
4.4.1 Definitions 

All the Timing Specifications given in this section refer to
0.8V and 2.0V on all the input and output signals as
illustrated in Figures 4-3 and 4-4, unless specifically stated
otherwise.

ABBREVIATIONS

L.E. - Leading Edge        R.E. - Rising Edge
T.E. - Trailing Edge       F.E. - Falling Edge

FIGURE 4-3. Timing Specification Standard
(Signal Valid after Clock Edge)

FIGURE 4-4. Timing Specification Standard
(Signal Valid before Clock Edge)

4.4.2 Timing Tables (Maximum times assume temperature range 0°C to 70°C)

4.4.2.1 Output Signal Propagation Delays for all CPUs (16-Bit Slave Protocol)

(Maximum times assume capacitive loading of 100 pF)

Symbol       Figure  Description               Reference/Conditions   NS32381-15, -20, -25
                                                                      Min          Max          Units

tSPCFw       4-18    SPC Pulse Width from FPU  At 0.8V (Both Edges)   tCLKp - 10   tCLKp + 10   ns
tSPCFa       4-18    SPC Output Active         After CLK R.E.
tSPCFia      4-18    SPC Output Inactive       After CLK R.E.
tSPCFf (1)   4-18    SPC Output Floating       After CLK F.E.



4.4.2.2 Output Signal Propagation Delays for the NS32008, NS32016 and NS32032 CPUs

Maximum times assume capacitive loading of 100 pF

Description           Reference/Conditions   NS32381-15   NS32381-20   NS32381-25

Data Valid (D0-D15)   After SPC L.E.
D0-D15 Floating       After SPC T.E.

4.4.2.3 Output Signal Propagation Delays for the 32-Bit Slave Protocol NS32332 CPU

Maximum times assume capacitive loading of 100 pF unless otherwise specified

Symbol  Figure   Description                     Reference/Conditions                  NS32381-15
                                                                                       Min              Max

tDv     4-10     Data Valid                      After SPC L.E.; 75 pF Cap. Loading
        4-10     Data Hold                       After SPC T.E.
        4-10     Data Floating                   After SPC T.E.
        4-12,13  Slave Done Active               After CLK F.E.
        4-13     Slave Done Hold                 After CLK R.E.
        4-12     Slave Done Pulse Width          At 0.8V (Both Edges)
        4-12,13  Slave Done Floating             After CLK R.E.
        4-13     Slave Done (TRAP) Pulse Width   At 0.8V (Both Edges)                  2½ tCLKp - 10    2½ tCLKp + 10

4.4.2.4 Output Signal Propagation Delays for the 32-Bit Slave Protocol NS32532 CPU

Maximum times assume capacitive loading of 50 pF

                                                                           NS32381-20  NS32381-25  NS32381-30
Symbol      Figure  Description                        Reference/Conditions  Min  Max    Min  Max    Min  Max   Units

tDv         4-14    Data Valid                         After SPC L.E.             35          35          35    ns
tDh         4-14    Data Hold                          After CLK R.E.        3           3           3          ns
            4-14    Data Floating                      After SPC T.E.             30          30          30    ns
            4-16    Slave Done Active                  After CLK R.E.             35          25          20    ns
tSDh        4-16    Slave Done Hold                    After CLK R.E.        2    33     2    25     2    20    ns
tSDf (1)    4-16    Slave Done Floating                After CLK R.E.             30          30          30    ns
tFSSRa      4-17    Forced Slave Status Read Active    After CLK R.E.             35          25          20    ns
tFSSRh      4-17    Forced Slave Status Read Hold      After CLK R.E.        2    33     2    25     2    20    ns
tFSSRf (1)  4-17    Forced Slave Status Read Floating  After CLK R.E.             30          30          30    ns

4.4.2.5 Input Signal Requirements with all CPUs

(NS32381-15, NS32381-20, NS32381-25 and NS32381-30)

Symbol  Description               Reference/Conditions   Min   Units

tPWR    Power-On Reset Duration   After CLK R.E.         30    µs
tRSTw   Reset Pulse Width         At 0.8V (Both Edges)   64    clock cycles
        Reset Setup Time          Before CLK R.E.              ns
tRSTh   Reset Hold                After CLK R.E.         0     ns

4.4.2.6 Input Signal Requirements with the NS32008, NS32016, NS32032 CPUs 


                                                               NS32381-15  NS32381-20  NS32381-25
Symbol  Figure  Description             Reference/Conditions   Min   Max   Min   Max   Min   Max   Units
tSs     4-8     Status (ST0-ST1) Setup  Before SPC L.E.        20          20          15          ns
tSh     4-8     Status (ST0-ST1) Hold   After SPC L.E.         20          20          17          ns
tDs             Data Setup (D0-D15)     Before SPC T.E.        25          20          15          ns
tDh             Data Hold (D0-D15)      After SPC T.E.         20          20          15          ns
tSPCw   4-8     SPC Pulse Width         At 0.8V (Both Edges)   35          35          28          ns
                from CPU


Note 1: Not 100% tested. 

4.4.2.7 Input Signal Requirements with the 32-Bit Slave Protocol NS32332 CPU 


                                                      NS32381-15
Symbol  Figure  Description      Reference/Conditions  Min   Max   Units
tSTs    4-11    Status Setup     Before SPC L.E.       20          ns
tSTh    4-11    Status Hold      After SPC L.E.        20          ns
tDs     4-11    Data Setup       Before SPC T.E.       20          ns
tDh     4-11    Data Hold        After SPC T.E.        20          ns
tSPCw   4-11    SPC Pulse Width  At 0.8V (Both Edges)  35          ns


4.4.2.8 Input Signal Requirements with the 32-Bit Slave Protocol NS32532 CPU 


                                                                 NS32381-20  NS32381-25  NS32381-30
Symbol   Figure   Description              Reference/Conditions  Min   Max   Min   Max   Min   Max   Units
tSTs     4-15     Status Setup             Before CLK (T2) R.E.  25          20          20          ns
tSTh     4-15     Status Hold              After CLK (T2) R.E.   20          10          10          ns
tDDINs   4-15     Data Direction In Setup  Before SPC L.E.       0           0           0           ns
tDDINh   4-15     Data Direction In Hold   After SPC T.E.        10          10          10          ns
         4-15     Data Setup               Before SPC T.E.       6           6           4           ns
tDh      4-15     Data Hold                After SPC T.E.        20          10          10          ns
tSPCs    4-14,15  SPC Setup                Before CLK R.E.       20          20          20          ns
tSPCh    4-14,15  SPC Hold                 After CLK R.E.        0           0           0           ns


4.4.2.9 Clocking Requirements with all CPUs 


                                                          NS32381-15  NS32381-20  NS32381-25  NS32381-30
Symbol    Figure  Description      Reference/Conditions       Min   Max   Min   Max   Min   Max   Min   Max   Units
tCLKh             Clock High Time                                   1000        1000        1000  13    1000  ns
tCLKl             Clock Low Time   At 0.8V (Both Edges)             DC          DC          DC    13    DC    ns
tCTr (1)          Clock Rise Time  Between 0.8V and 2.0V            7           5           4           3     ns
tCTd (1)          Clock Fall Time  Between 2.0V and 0.8V            7           5           4           3     ns
                  Clock Period     CLK R.E. to Next CLK R.E.  66    DC    50    DC          DC    33.3  DC    ns


Note 1: Not 100% tested. 

4.4.3 Timing Diagrams 



FIGURE 4-5. Clock Timing 


TL/EE/91 57-21 



TL/EE/91 57-22 

FIGURE 4-6. Power-On Reset 





TL/EE/91 57-23 

FIGURE 4-7. Non-Power-On Reset 



TL/EE/9157-24 

FIGURE 4-8. RST Release Timing 
Note: The rising edge of RST must occur while CLK is high, as shown. 



TL/EE/91 57-25 


FIGURE 4-9. Read Cycle from FPU (NS32008, NS32016, NS32032 CPUs) 



TL/EE/91 57-26 

FIGURE 4-10. Write Cycle to FPU (NS32008, NS32016, NS32032 CPUs) 



FIGURE 4-11. Read Cycle from FPU (NS32332 CPU) 



TL/EE/91 57-29 



TL/EE/91 57-30 


FIGURE 4-14. SDN332 (TRAP) Timing (NS32332 CPU) 
FIGURE 4-16. Write Cycle to FPU (NS32532 CPU)

FIGURE 4-17. SDN532 Timing (NS32532 CPU)



FIGURE 4-18. FSSR Timing (NS32532 CPU) 



TL/EE/91 57-34 
Appendix A

NS32381 PERFORMANCE ANALYSIS 

The following performance numbers were taken from simu- 
lations using the 381 SIMPLE model. The timing terms have 
been designed to provide performance numbers which are 
CPU independent. Numbers were obtained from SIMPLE 
simulations, taking the average execution times using ‘typi- 
cal’ operands. 

Listed below are definitions of the timing terms: 

EXT — (Execution Time) This is the time from the last data 
sent to the FPU, until the early DONE is issued. 
(FPU Pipe is empty) 

EDD — (Early Done Delta) This is the time from when the 
early DONE is issued until the execution of the next 
instruction may start. 

Provided that the CPU can transfer the ID/OPCODE and 
any operands to the FPU during the EDD time, the average 
system execution time for an instruction (keeping the FPU 
pipe filled) is: EXT + EDD. 

The system execution time for a single FPU instruction with 
FPU register destination and early done is: EXT plus the 
protocol time. (FPU pipe is initially empty) 
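
The two formulas above can be sketched in a few lines of Python. This is only an illustration: the cycle counts are copied from the table in this appendix for a handful of instructions, the function names are hypothetical, and the protocol time is CPU dependent and not given here, so the caller must supply it.

```python
# (EXT, EDD) cycle counts copied from the Appendix A table.
CYCLES = {
    "MOVF": (5, 6),
    "MULF": (11, 20),
    "MULL": (11, 27),
    "DIVF": (11, 45),
    "DIVL": (11, 59),
}


def pipelined_cycles(mnemonic):
    """Average cycles per instruction with the FPU pipe kept filled: EXT + EDD."""
    ext, edd = CYCLES[mnemonic]
    return ext + edd


def single_instruction_cycles(mnemonic, protocol_cycles):
    """Cycles for one instruction with an initially empty pipe: EXT + protocol time.

    protocol_cycles is an assumption supplied by the caller; this appendix
    does not list it.
    """
    ext, _ = CYCLES[mnemonic]
    return ext + protocol_cycles


def cycles_to_ns(cycles, clock_mhz):
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000.0 / clock_mhz
```

For example, `pipelined_cycles("MULF")` reproduces the Total column entry for MULF, and `cycles_to_ns` converts it to a time for a chosen clock rate.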


Instruction          EXT*   EDD*   Total*
LFSR   any, reg      5      8      13
MOVF   any, reg      5      6      11
MOVL   any, reg      5      8      13
MOVif  any, reg      5      45     50
MOVFL  any, reg      9      6      15
ADDF   any, reg      11
ADDL   any, reg      11
SUBF   any, reg      11
SUBL   any, reg      11
MULF   any, reg      11     20     31
MULL   any, reg      11     27     38
DIVF   any, reg      11     45     56
DIVL   any, reg      11     59     70
POLYF  any, any      15     46     61
POLYL  any, any      15     53     68
DOTF   any, any      15     46     61
DOTL   any, any      15     53     68

*Measured in the number of clock cycles.



The following instructions do not generate an early done. In 
this case, EXT is the time from the last data sent to the FPU, 
until the normal DONE is issued. (FPU Pipe is empty) 


Instruction           EXT
SFSR     reg, mem     7
MOVLF    any, any     18
ROUNDfi  any, mem     46
FLOORfi  any, mem     46
TRUNCfi  any, mem     46
CMPF     any, any     17
CMPL     any, any     17
ABSf     any, any     9
NEGf     any, any     9
SCALBf   any, any     49
LOGBf    any, any     36
National Semiconductor


NS32081-10/NS32081-15 Floating-Point Units 


General Description 

The NS32081 Floating-Point Unit functions as a slave
processor in National Semiconductor’s Series 32000®
microprocessor family. It provides a high-speed floating-point
instruction set for any Series 32000 family CPU, while
remaining architecturally consistent with the full two-address
architecture and powerful addressing modes of the Series 32000
microprocessor family.


Features 

■ Eight on-chip data registers 

■ 32-bit and 64-bit operations 

■ Supports proposed IEEE standard for binary floating- 
point arithmetic, Task P754 

■ Directly compatible with NS32016, NS32008 and 
NS32032 CPUs 

■ High-speed XMOS™ technology

■ Single 5V supply 

■ 24-pin dual in-line package 


Block Diagram 



TL/EE/5234-1 


















Table of Contents 


1.0 PRODUCT INTRODUCTION 

1.1 Operand Formats 

1.1.1 Normalized Numbers 

1.1.2 Zero 

1.1.3 Reserved Operands 

1.1.4 Integers 

1.1.5 Memory Representations 

2.0 ARCHITECTURAL DESCRIPTION 

2.1 Programming Model 

2.1.1 Floating-Point Registers 

2.1.2 Floating-Point Status Register (FSR) 

2.1.2.1 FSR Mode Control Fields

2.1.2.2 FSR Status Fields

2.1.2.3 FSR Software Field (SWF)

2.2 Instruction Set 

2.2.1 General Instruction Format 

2.2.2 Addressing Modes 

2.2.3 Floating-Point Instruction Set 

2.3 Traps 

3.0 FUNCTIONAL DESCRIPTION 

3.1 Power and Grounding 

3.2 Clocking 

3.3 Resetting 



3.4 Bus Operation 

3.4.1 Bus Cycles 

3.4.2 Operand Transfer Sequences 

3.5 Instruction Protocols 

3.5.1 General Protocol Sequence 

3.5.2 Floating-Point Protocols 

4.0 DEVICE SPECIFICATIONS 

4.1 Pin Descriptions 

4.1.1 Supplies 

4.1.2 Input Signals 

4.1.3 Input/Output Signals 

4.2 Absolute Maximum Ratings 

4.3 Electrical Characteristics 

4.4 Switching Characteristics 

4.4.1 Definitions 

4.4.2 Timing Tables 

4.4.2.1 Output Signals: Internal Propagation Delays

4.4.2.2 Input Signals Requirements

4.4.2.3 Clocking Requirements

4.4.3 Timing Diagrams 




List of Illustrations 

Floating-Point Operand Formats 1-1 

Register Set 2-1 

The Floating-Point Status Register 2-2 

General Instruction Format 2-3 

Index Byte Format 2-4 

Displacement Encodings 2-5 

Floating-Point Instruction Formats 2-6 

Recommended Supply Connections 3-1 

Power-On Reset Requirements 3-2 

General Reset Timing 3-3 

System Connection Diagram 3-4 

Slave Processor Read Cycle 3-5 

Slave Processor Write Cycle 3-6 

FPU Protocol Status Word Format 3-7 

Dual-In-Line Package 4-1

Timing Specification Standard (Signal Valid After Clock Edge) 4-2 

Timing Specification Standard (Signal Valid Before Clock Edge) 4-3 

Clock Timing 4-4 

Power-On-Reset 4-5 

Non-Power-On-Reset 4-6 

Read Cycle From FPU 4-7 

Write Cycle To FPU 4-8 

SPC Pulse from FPU 4-9 

RST Release Timing 4-10 

List of Tables 

Sample F Fields 1-1 

Sample E Fields 1-2 

Normalized Number Ranges 1-3 

Series 32000 Family Addressing Modes 2-1 

General Instruction Protocol 3-1 

Floating-Point Instruction Protocols 3-2 





1.0 Product Introduction 

The NS32081 Floating-Point Unit (FPU) provides high 
speed floating-point operations for the Series 32000 family, 
and is fabricated using National high-speed XMOS technol- 
ogy. It operates as a slave processor for transparent expan- 
sion of the Series 32000 CPU’s basic instruction set. The 
FPU can also be used with other microprocessors as a pe- 
ripheral device by using additional TTL interface logic. The 
NS32081 is compatible with the IEEE Floating-Point For- 
mats by means of its hardware and software features. 

1.1 OPERAND FORMATS 

The NS32081 FPU operates on two floating-point data 
types — single precision (32 bits) and double precision (64 
bits). Floating-point instruction mnemonics use the suffix F 
(Floating) to select the single precision data type, and the 
suffix L (Long Floating) to select the double precision data 
type. 

A floating-point number is divided into three fields, as shown 
in Figure 1-1. 

The F field is the fractional portion of the represented
number. In Normalized numbers (Section 1.1.1), the binary point
is assumed to be immediately to the left of the most
significant bit of the F field, with an implied 1 bit to the left
of the binary point. Thus, the F field represents values in the
range 1.0 ≤ x < 2.0.

TABLE 1-1. Sample F Fields

F Field      Binary Value    Decimal Value
000 ... 0    1.000 ... 0     1.000 ... 0
010 ... 0    1.010 ... 0     1.250 ... 0
100 ... 0    1.100 ... 0     1.500 ... 0
110 ... 0    1.110 ... 0     1.750 ... 0
             ↑
             Implied Bit

The E field contains an unsigned number that gives the
binary exponent of the represented number. The value in the
E field is biased; that is, a constant bias value must be
subtracted from the E field value in order to obtain the true
exponent. The bias value is 011 ... 11 (binary), which is
either 127 (single precision) or 1023 (double precision). Thus,
the true exponent can be either positive or negative, as shown
in Table 1-2.


TABLE 1-2. Sample E Fields

E Field        F Field      Represented Value
011 ... 110    100 ... 0    1.5 × 2^-1 = 0.75
011 ... 111    100 ... 0    1.5 × 2^0  = 1.50
100 ... 000    100 ... 0    1.5 × 2^1  = 3.00


Two values of the E field are not exponents. 11 ... 11
signals a reserved operand (Section 1.1.3). 00 ... 00
represents the number zero if the F field is also all zeroes;
otherwise it signals a reserved operand.

The S bit indicates the sign of the operand. It is 0 for posi- 
tive and 1 for negative. Floating-point numbers are in sign- 
magnitude form, that is, only the S bit is complemented in 
order to change the sign of the represented number. 

1.1.1 Normalized Numbers 

Normalized numbers are numbers which can be expressed 
as floating-point operands, as described above, where the E 
field is neither all zeroes nor all ones. 

The value of a Normalized number can be derived by the 
formula: 

(-1)^S × 2^(E - Bias) × (1 + F)

The range of Normalized numbers is given in Table 1-3. 
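
As an illustration of the formula, the following Python sketch evaluates a single precision Normalized number from its 32-bit pattern. The field positions (S in bit 31, E in bits 30-23, F in bits 22-0) follow Figure 1-1; the function name is hypothetical and the sketch deliberately rejects zero and reserved operands.

```python
def decode_single(bits):
    """Evaluate (-1)^S x 2^(E - Bias) x (1 + F) for a 32-bit single precision pattern."""
    s = (bits >> 31) & 0x1        # sign bit
    e = (bits >> 23) & 0xFF       # 8-bit biased exponent
    f = bits & 0x7FFFFF           # 23-bit fraction field
    if e == 0 or e == 0xFF:
        # All-zero or all-one E fields are not exponents (zero or reserved operand).
        raise ValueError("not a Normalized number")
    fraction = f / float(1 << 23)  # F as a value in the range 0 <= F < 1
    return (-1) ** s * 2.0 ** (e - 127) * (1.0 + fraction)
```

Applied to the three rows of Table 1-2 (E fields 126, 127 and 128 with F = 100 ... 0), the function returns 0.75, 1.5 and 3.0.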

1.1.2 Zero 

There are two representations for zero — positive and nega- 
tive. Positive zero has all-zero F and E fields, and the S bit is 
zero. Negative zero also has all-zero F and E fields, but its S 
bit is one. 

1.1.3 Reserved Operands 

The proposed IEEE Standard for Binary Floating-Point Arith- 
metic (Task P754) provides for certain exceptional forms of 
floating-point operands. The NS32081 FPU treats these 
forms as reserved operands. The reserved operands are: 

• Positive and negative infinity 

• Not-a-Number (NaN) values 

• Denormalized numbers 

Both Infinity and NaN values have all ones in their E fields. 
Denormalized numbers have all zeroes in their E fields and 
non-zero values in their F fields. 

The NS32081 FPU causes an Invalid Operation trap
(Section 2.1.2.2) if it receives a reserved operand, unless the
operation is simply a move (without conversion). The FPU
does not generate reserved operands as results.
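
The operand classes of Sections 1.1.1 through 1.1.3 can be told apart from the E and F fields alone. The sketch below does this for single precision patterns. The function name and the returned strings are hypothetical. The split between infinity (F all zeroes) and NaN (F non-zero) is taken from the IEEE convention and is an assumption here, since this data sheet only states that both have all ones in their E fields.

```python
def classify_single(bits):
    """Classify a 32-bit single precision pattern by its S, E and F fields."""
    s = (bits >> 31) & 0x1
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if e == 0:
        if f == 0:
            return "negative zero" if s else "positive zero"
        return "reserved (denormalized)"
    if e == 0xFF:
        # Assumption: IEEE convention, F == 0 is infinity and F != 0 is NaN.
        return "reserved (infinity)" if f == 0 else "reserved (NaN)"
    return "normalized"
```

Any result beginning with "reserved" corresponds to an operand that would cause an Invalid Operation trap in anything other than a move without conversion.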


Single Precision

 31 30        23 22                    0
| S |     E     |           F           |

Double Precision

 63 62        52 51                    0
| S |     E     |           F           |

FIGURE 1-1. Floating-Point Operand Formats




TABLE 1-3. Normalized Number Ranges

                 Single Precision           Double Precision
Most Positive    2^127 × (2 - 2^-23)        2^1023 × (2 - 2^-52)
                 = 3.40282346 × 10^38       = 1.7976931348623157 × 10^308
Least Positive   2^-126                     2^-1022
                 = 1.17549436 × 10^-38      = 2.2250738585072014 × 10^-308
Least Negative   -(2^-126)                  -(2^-1022)
                 = -1.17549436 × 10^-38     = -2.2250738585072014 × 10^-308
Most Negative    -2^127 × (2 - 2^-23)       -2^1023 × (2 - 2^-52)
                 = -3.40282346 × 10^38      = -1.7976931348623157 × 10^308

Note: The values given are extended one full digit beyond their represented accuracy to help in generating rounding and conversion algorithms. 


1.1.4 Integers 

In addition to performing floating-point arithmetic, the 
NS32081 FPU performs conversions between integer and 
floating-point data types. Integers are accepted or generat- 
ed by the FPU as two’s complement values of byte (8 bits), 
word (16 bits) or double word (32 bits) length. 

1.1.5 Memory Representations 

The NS32081 FPU does not directly access memory. How- 
ever, it is cooperatively involved in the execution of a set of 
two-address instructions with its Series 32000 Family CPU. 
The CPU determines the representation of operands in 
memory. 

In the Series 32000 family of CPUs, operands are stored in 
memory with the least significant byte at the lowest byte 
address. The only exception to this rule is the Immediate 
addressing mode, where the operand is held (within the in- 
struction format) with the most significant byte at the lowest 
address. 
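
The two byte orders can be illustrated with Python's built-in integer conversion. This is a minimal sketch with hypothetical function names; it only shows the ordering of bytes, not any bus behavior.

```python
def memory_bytes(value, size):
    """Bytes of an operand as stored in memory, lowest byte address first
    (least significant byte at the lowest address)."""
    return value.to_bytes(size, "little")


def immediate_bytes(value, size):
    """Bytes of an Immediate operand as held within the instruction format
    (most significant byte at the lowest address)."""
    return value.to_bytes(size, "big")
```

For the single precision pattern 0x3FC00000 (the value 1.5), `memory_bytes` yields 00 00 C0 3F, while `immediate_bytes` yields 3F C0 00 00.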

2.0 Architectural Description 

2.1 PROGRAMMING MODEL 

The Series 32000 architecture includes nine registers that 
are implemented on the NS32081 Floating-Point Unit (FPU). 


2.1.1 Floating-Point Registers 

There are eight registers (F0-F7) on the NS32081 FPU for 
providing high-speed access to floating-point operands. 
Each is 32 bits long. A floating-point register is referenced 
whenever a floating-point instruction uses the Register ad- 
dressing mode (Section 2.2.2) for a floating-point operand. 
All other Register mode usages (i.e., integer operands) refer 
to the General Purpose Registers (R0-R7) of the CPU, and 
the FPU transfers the operand as if it were in memory. 
When the Register addressing mode is specified for a dou- 
ble precision (64-bit) operand, a pair of registers holds the 
operand. The programmer must specify the even register of 
the pair. The even register contains the least significant half 
of the operand and the next consecutive register contains 
the most significant half. 
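
The register-pair rule can be sketched as follows. The list `regs` is a hypothetical software model of F0-F7 as eight 32-bit values, and the function names are illustrative only.

```python
def split_double(bits64):
    """Split a 64-bit operand into (least significant half, most significant half)."""
    low = bits64 & 0xFFFFFFFF
    high = (bits64 >> 32) & 0xFFFFFFFF
    return low, high


def store_double(regs, even_reg, bits64):
    """Place a double precision operand in the pair (even_reg, even_reg + 1).

    The even register receives the least significant half and the next
    consecutive register receives the most significant half.
    """
    if even_reg % 2 != 0 or not 0 <= even_reg <= 6:
        raise ValueError("a double precision operand must name an even register")
    regs[even_reg], regs[even_reg + 1] = split_double(bits64)
```

Storing the double precision pattern for 1.5 (0x3FF8000000000000) at F2 leaves 0x00000000 in F2 and 0x3FF80000 in F3.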

2.1.2 Floating-Point Status Register (FSR) 

The Floating-Point Status Register (FSR) selects operating 
modes and records any exceptional conditions encountered 
during execution of a floating-point operation. Figure 2-2 
shows the format of the FSR. 


FIGURE 2-1. Register Set

 31            16 15       9 8    7  6    5     4    3     2    0
|                |   SWF    |  RM  | IF | IEN | UF | UEN |   TT   |

FIGURE 2-2. The Floating-Point Status Register

2.1.2.1 FSR Mode Control Fields

The FSR mode control fields select FPU operation modes. 
The meanings of the FSR mode control bits are given be- 
low. 

Rounding Mode (RM): Bits 7 and 8. This field selects the 
rounding method. Floating-point results are rounded when- 
ever they cannot be exactly represented. The rounding 
modes are: 

00 Round to nearest value. The value which is nearest to 
the exact result is returned. If the result is exactly half- 
way between the two nearest values the even value 
(LSB = 0) is returned. 

01 Round toward zero. The nearest value which is closer to 
zero or equal to the exact result is returned. 

10 Round toward positive infinity. The nearest value which
is greater than or equal to the exact result is returned.

11 Round toward negative infinity. The nearest value which
is less than or equal to the exact result is returned.
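
The four rounding rules can be illustrated in Python. As a simplifying assumption, the sketch rounds an exact rational value to the nearest integers rather than to the nearest representable floating-point values; the selection logic per RM code is the same. The function name is hypothetical.

```python
import math
from fractions import Fraction


def round_exact(x, rm):
    """Round the exact value x (a Fraction) to an integer under RM code rm."""
    if rm == 0b00:
        # Round to nearest; a value exactly halfway goes to the even neighbor.
        # Python's round() on a Fraction implements this tie rule.
        return round(x)
    if rm == 0b01:
        return math.trunc(x)   # toward zero
    if rm == 0b10:
        return math.ceil(x)    # toward positive infinity
    if rm == 0b11:
        return math.floor(x)   # toward negative infinity
    raise ValueError("RM is a 2-bit field")
```

With the exact value 5/2, mode 00 gives 2 (the even neighbor), mode 01 gives 2, mode 10 gives 3 and mode 11 gives 2; with -5/2, mode 11 gives -3.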

Underflow Trap Enable (UEN): Bit 3. If this bit is set, the 
FPU requests a trap whenever a result is too small in abso- 
lute value to be represented as a normalized number. If it is 
not set, any underflow condition returns a result of exactly 
zero. 

Inexact Result Trap Enable (IEN): Bit 5. If this bit is set, 
the FPU requests a trap whenever the result of an operation 
cannot be represented exactly in the operand format of the 
destination. If it is not set, the result is rounded according to 
the selected rounding mode. 

2.1.2.2 FSR Status Fields

The FSR Status Fields record exceptional conditions en- 
countered during floating-point data processing. The mean- 
ings of the FSR status bits are given below: 

Trap Type (TT): bits 0-2. This 3-bit field records any excep- 
tional condition detected by a floating-point instruction. The 
TT field is loaded with zero whenever any floating-point in- 
struction except LFSR or SFSR completes without encoun- 
tering an exceptional condition. It is also set to zero by a 
hardware reset or by writing zero into it with the Load FSR 
(LFSR) instruction. Underflow and Inexact Result are always 
reported in the TT field, regardless of the settings of the 
UEN and IEN bits. 

000 No exceptional condition occurred. 

001 Underflow. A non-zero floating-point result is too small 
in magnitude to be represented as a normalized float- 
ing-point number in the format of the destination oper- 
and. This condition is always reported in the TT field 
and UF bit, but causes a trap only if the UEN bit is set. If 
the UEN bit is not set, a result of Positive Zero is pro- 
duced, and no trap occurs. 

010 Overflow. A result (either floating-point or integer) of a 
floating-point instruction is too great in magnitude to be 
held in the format of the destination operand. Note that 
rounding, as well as calculations, can cause this condi- 
tion. 

011 Divide by zero. An attempt has been made to divide a
non-zero floating-point number by zero. Dividing zero by 
zero is considered an Invalid Operation instead (below). 


100 Illegal Instruction. Two undefined floating-point instruc- 
tion forms are detected by the FPU as being illegal. The 
binary formats causing this trap are: 

xxxxxxxxxx0011xx10111110
xxxxxxxxxx1001xx10111110

101 Invalid Operation. One of the floating-point operands of 
a floating-point instruction is a Reserved operand, or an 
attempt has been made to divide zero by zero using the 
DIVf instruction. 

110 Inexact Result. The result (either floating-point or inte- 
ger) of a floating-point instruction cannot be represent- 
ed exactly in the format of the destination operand, and 
a rounding step must alter it to fit. This condition is al- 
ways reported in the TT field and IF bit unless any other 
exceptional condition has occurred in the same instruc- 
tion. In this case, the TT field always contains the code 
for the other exception and the IF bit is not altered. A 
trap is caused by this condition only if the IEN bit is set; 
otherwise the result is rounded and delivered, and no 
trap occurs. 

111 (Reserved for future use.)
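
The two binary formats listed under trap type 100 (Illegal Instruction) can be matched against a 24-bit instruction word with a simple don't-care comparison. The sketch below assumes the patterns are written with bit 23 on the left, as in the instruction format figures; the function names are hypothetical.

```python
# Patterns from the Illegal Instruction entry; 'x' marks a don't-care bit.
ILLEGAL_PATTERNS = (
    "xxxxxxxxxx0011xx10111110",
    "xxxxxxxxxx1001xx10111110",
)


def matches(pattern, word):
    """True if the 24-bit word agrees with every '0' and '1' in the pattern."""
    bits = format(word & 0xFFFFFF, "024b")   # bit 23 first, bit 0 last
    return all(p == "x" or p == b for p, b in zip(pattern, bits))


def is_illegal(word):
    """True if the word matches either undefined floating-point instruction form."""
    return any(matches(p, word) for p in ILLEGAL_PATTERNS)
```

A word whose low byte is 10111110 and whose four bits above the two don't-care bits are 0011 or 1001 is reported as illegal; any other word is not.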

Underflow Flag (UF): Bit 4. This bit is set by the FPU when- 
ever a result is too small in absolute value to be represented 
as a normalized number. Its function is not affected by the 
state of the UEN bit. The UF bit is cleared only by writing a 
zero into it with the Load FSR instruction or by a hardware 
reset. 

Inexact Result Flag (IF): Bit 6. This bit is set by the FPU 
whenever the result of an operation must be rounded to fit 
within the destination format. The IF bit is set only if no other 
error has occurred. It is cleared only by writing a zero into it 
with the Load FSR instruction or by a hardware reset. 

2.1.2.3 FSR Software Field (SWF) 

Bits 9-15 of the FSR hold and display any information writ- 
ten to them (using the LFSR and SFSR instructions), but are 
not otherwise used by FPU hardware. They are reserved for 
use with NSC floating-point extension software. 
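
The field positions given in Sections 2.1.2.1 through 2.1.2.3 can be summarized by a small decoder. This is a sketch with hypothetical names; it extracts the fields of a 32-bit FSR value read with SFSR and does not model FPU behavior.

```python
# Trap Type codes from Section 2.1.2.2.
TRAP_TYPES = {
    0b000: "No exceptional condition",
    0b001: "Underflow",
    0b010: "Overflow",
    0b011: "Divide by zero",
    0b100: "Illegal Instruction",
    0b101: "Invalid Operation",
    0b110: "Inexact Result",
    0b111: "(Reserved)",
}


def decode_fsr(fsr):
    """Split an FSR value into its documented fields."""
    return {
        "TT": fsr & 0x7,            # bits 0-2, Trap Type
        "UEN": (fsr >> 3) & 0x1,    # bit 3, Underflow Trap Enable
        "UF": (fsr >> 4) & 0x1,     # bit 4, Underflow Flag
        "IEN": (fsr >> 5) & 0x1,    # bit 5, Inexact Result Trap Enable
        "IF": (fsr >> 6) & 0x1,     # bit 6, Inexact Result Flag
        "RM": (fsr >> 7) & 0x3,     # bits 7-8, Rounding Mode
        "SWF": (fsr >> 9) & 0x7F,   # bits 9-15, Software Field
    }
```

For instance, the value 721 (binary 1011010001) decodes to TT = 001 (Underflow), UF = 1, IF = 1, RM = 01 and SWF = 1, with both trap enables clear.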

2.2 INSTRUCTION SET 

2.2.1 General Instruction Format 

Figure 2-3 shows the general format of a Series 32000
instruction. The Basic Instruction is one to three bytes long
and contains the opcode and up to two 5-bit General
Addressing Mode (Gen) fields. Following the Basic Instruction
field is a set of optional extensions, which may appear
depending on the instruction and the addressing modes
selected.

FIGURE 2-3. General Instruction Format

The only form of extension issued to the NS32081 FPU is 
an Immediate operand. Other extensions are used only by 
the CPU to reference memory operands needed by the 
FPU. 

Index Bytes appear when either or both Gen fields specify 
Scaled Index. In this case, the Gen field specifies only the 
Scale Factor (1, 2, 4 or 8), and the Index Byte specifies 
which General Purpose Register to use as the index, and 
which addressing mode calculation to perform before index- 
ing. See Figure 2-4. 

Following Index Bytes come any displacements (addressing
constants) or immediate values associated with the selected
addressing modes. Each Disp/Imm field may contain
one or two displacements, or one immediate value. The size
of a Displacement field is encoded within the top bits of that
field, as shown in Figure 2-5, with the remaining bits
interpreted as a signed (two’s complement) value. The size of an
immediate value is determined from the Opcode field. Both
Displacement and Immediate fields are stored most
significant byte first.

Some non-FPU instructions require additional, “implied” im- 
mediates and/or displacements, apart from those associat- 
ed with addressing modes. Any such extensions appear at 
the end of the instruction, in the order that they appear with- 
in the list of operands in the instruction definition. 

2.2.2 Addressing Modes 

The Series 32000 Family CPUs generally access an oper- 
and by calculating its Effective Address based on informa- 
tion available when the operand is to be accessed. The 
method to be used in performing this calculation is specified 
by the programmer as an “addressing mode.” 

Addressing modes in the Series 32000 family are designed 
to optimally support high-level language accesses to vari- 
ables. In nearly all cases, a variable access requires only 
one addressing mode within the instruction which acts upon 
that variable. Extraneous data movement is therefore mini- 
mized. 

Series 32000 Addressing Modes fall into nine basic types:

Register: In floating-point instructions, these addressing
modes refer to a Floating-Point Register (F0-F7) if the
operand is of a floating-point type. Otherwise, a CPU General
Purpose Register (R0-R7) is referenced. See Section 2.1.1.

Register Relative: A CPU General Purpose Register
contains an address to which is added a displacement value
from the instruction, yielding the Effective Address of the
operand in memory.

Memory Space: Identical to Register Relative above,
except that the register used is one of the dedicated CPU
registers PC, SP, SB or FP. These registers point to data
areas generally needed by high-level languages.

Memory Relative: A pointer variable is found within the
memory space pointed to by the CPU SP, SB or FP register.
A displacement is added to that pointer to generate the
Effective Address of the operand.

Immediate: The operand is encoded within the instruction.
This addressing mode is not allowed if the operand is to be
written. Floating-point operands as well as integer operands
may be specified using Immediate mode.

Absolute: The address of the operand is specified by a
Displacement field in the instruction.

External: A pointer value is read from a specified entry of
the current Link Table. To this pointer value is added a
displacement, yielding the Effective Address of the operand.

Top of Stack: The currently-selected CPU Stack Pointer
(SP0 or SP1) specifies the location of the operand. The
operand is pushed or popped, depending on whether it is
written or read.

Scaled Index: Although encoded as an addressing mode,
Scaled Indexing is an option on any addressing mode
except Immediate or another Scaled Index. It has the effect of
calculating an Effective Address, then multiplying any
General Purpose Register by 1, 2, 4 or 8 and adding it into the
total, yielding the final Effective Address of the operand.

The following table, Table 2-1, is a brief summary of the
addressing modes. For a complete description of their
actions, see the Series 32000 Instruction Set Reference Manual.

  7                 3 2        0
| GEN. ADDR. MODE    | REG. NO. |

FIGURE 2-4. Index Byte Format
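
The Scaled Index calculation described above amounts to one multiply and one add. The sketch below assumes the Effective Address of the underlying mode has already been computed by the CPU; the names are hypothetical and address wraparound is not modeled.

```python
# Scale factors selected by the B, W, D and Q forms of Scaled Index.
SCALE = {"B": 1, "W": 2, "D": 4, "Q": 8}


def scaled_index_ea(base_ea, rn, size):
    """Final Effective Address: base mode's address plus scale factor times Rn."""
    return base_ea + SCALE[size] * rn
```

With a base Effective Address of 0x1000 and an index register value of 3, the quad-word form yields 0x1018 and the double-word form yields 0x100C.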
TABLE 2-1. Series 32000 Family Addressing Modes

Encoding  Mode                     Assembler Syntax     Effective Address

REGISTER
00000     Register 0               R0 or F0             None: Operand is in the specified register.
00001     Register 1               R1 or F1
00010     Register 2               R2 or F2
00011     Register 3               R3 or F3
00100     Register 4               R4 or F4
00101     Register 5               R5 or F5
00110     Register 6               R6 or F6
00111     Register 7               R7 or F7

REGISTER RELATIVE
01000     Register 0 relative      disp(R0)             Disp + Register.
01001     Register 1 relative      disp(R1)
01010     Register 2 relative      disp(R2)
01011     Register 3 relative      disp(R3)
01100     Register 4 relative      disp(R4)
01101     Register 5 relative      disp(R5)
01110     Register 6 relative      disp(R6)
01111     Register 7 relative      disp(R7)

MEMORY SPACE
11000     Frame memory             disp(FP)             Disp + Register; "SP" is either
11001     Stack memory             disp(SP)             SP0 or SP1, as selected in PSR.
11010     Static memory            disp(SB)
11011     Program memory           * + disp

MEMORY RELATIVE
10000     Frame memory relative    disp2(disp1(FP))     Disp2 + Pointer; Pointer found at
10001     Stack memory relative    disp2(disp1(SP))     address Disp1 + Register. "SP" is
10010     Static memory relative   disp2(disp1(SB))     either SP0 or SP1, as selected in PSR.

IMMEDIATE
10100     Immediate                value                None: Operand is issued from
                                                        CPU instruction queue.

ABSOLUTE
10101     Absolute                 @disp                Disp.

EXTERNAL
10110     External                 EXT(disp1) + disp2   Disp2 + Pointer; Pointer is found
                                                        at Link Table Entry number Disp1.

TOP OF STACK
10111     Top of Stack             TOS                  Top of current stack, using either
                                                        User or Interrupt Stack Pointer,
                                                        as selected in PSR. Automatic
                                                        Push/Pop included.

SCALED INDEX
11100     Index, bytes             mode[Rn:B]           Mode + Rn.
11101     Index, words             mode[Rn:W]           Mode + 2 × Rn.
11110     Index, double words      mode[Rn:D]           Mode + 4 × Rn.
11111     Index, quad words        mode[Rn:Q]           Mode + 8 × Rn.
                                                        "Mode" and "n" are contained
                                                        within the Index Byte.

10011     (Reserved for Future Use)

2.2.3 Floating-Point Instruction Set 

The NS32081 FPU instructions occupy formats 9 and 11 of
the Series 32000 Family instruction set (Figure 2-6). A list
of all Series 32000 family instruction formats is found in the
applicable CPU data sheet.

Certain notations in the following instruction description ta- 
bles serve to relate the assembly language form of each 
instruction to its binary format in Figure 2-6. 

Format 9

 23                16 15                 8 7               0
|   gen1   |   gen2   |   op   | f |  i   | 0 0 1 1 1 1 1 0 |
          OPERATION WORD                       ID BYTE

Format 11

 23                16 15                 8 7               0
|   gen1   |   gen2   |    op    | 0 | f  | 1 0 1 1 1 1 1 0 |
          OPERATION WORD                       ID BYTE

FIGURE 2-6. Floating-Point Instruction Formats


The Format column indicates which of the two formats in 
Figure 2-6 represents each instruction. 

The Op column indicates the binary pattern for the field 
called “op” in the applicable format. 

The Instruction column gives the form of each instruction as
it appears in assembly language. The form consists of an
instruction mnemonic in upper case, with one or more
suffixes (i or f) indicating data types, followed by a list of
operands (gen1, gen2).

An i suffix on an instruction mnemonic indicates a choice of 
integer data types. This choice affects the binary pattern in 
the i field of the corresponding instruction format (Figure 2-6 ) 
as follows: 


Suffix i   Data Type      i Field
B          Byte           00
W          Word           01
D          Double Word    11


An f suffix on an instruction mnemonic indicates a choice of
floating-point data types. This choice affects the setting of
the f bit of the corresponding instruction format (Figure 2-6)
as follows:

Suffix f   Data Type                  f Bit
F          Single Precision           1
L          Double Precision (Long)    0

An operand designation (gen1, gen2) indicates a choice of
addressing mode expressions. This choice affects the
binary pattern in the corresponding gen1 or gen2 field of the
instruction format (Figure 2-6). Refer to Table 2-1 for the
options available and their patterns.

Further details of the exact operations performed by each
instruction are found in the Series 32000 Instruction Set
Reference Manual.

Movement and Conversion

The following instructions move the gen1 operand to the
gen2 operand, leaving the gen1 operand intact.

Format  Op    Instruction          Description
11      0001  MOVf    gen1,gen2    Move without conversion.
9       010   MOVLF   gen1,gen2    Move, converting from double
                                   precision to single precision.
9       011   MOVFL   gen1,gen2    Move, converting from single
                                   precision to double precision.
9       000   MOVif   gen1,gen2    Move, converting from any integer
                                   type to any floating-point type.
9       100   ROUNDfi gen1,gen2    Move, converting from floating-point
                                   to the nearest integer.
9       101   TRUNCfi gen1,gen2    Move, converting from floating-point
                                   to the nearest integer closer to zero.
9       111   FLOORfi gen1,gen2    Move, converting from floating-point
                                   to the largest integer less than or
                                   equal to its value.

Note: The MOVLF instruction f bit must be 1 and the i field must be 10.
The MOVFL instruction f bit must be 0 and the i field must be 11.

Arithmetic Operations

The following instructions perform floating-point arithmetic
operations on the gen1 and gen2 operands, leaving the
result in the gen2 operand.

Format  Op    Instruction          Description
11      0000  ADDf    gen1,gen2    Add gen1 to gen2.
11      0100  SUBf    gen1,gen2    Subtract gen1 from gen2.
11      1100  MULf    gen1,gen2    Multiply gen2 by gen1.
11      1000  DIVf    gen1,gen2    Divide gen2 by gen1.
11      0101  NEGf    gen1,gen2    Move negative of gen1 to gen2.
11      1101  ABSf    gen1,gen2    Move absolute value of gen1 to gen2.
Comparison 

The Compare instruction compares two floating-point values, 
sending the result to the CPU PSR Z and N bits for use 
as condition codes. See Figure 3-7. The Z bit is set if the 
gen1 and gen2 operands are equal; it is cleared otherwise. 
The N bit is set if the gen1 operand is greater than the gen2 
operand; it is cleared otherwise. The CPU PSR L bit is 
unconditionally cleared. Positive and negative zero are 
considered equal. 

Format  Op    Instruction          Description 
11      0010  CMPf    gen1,gen2    Compare gen1 to gen2. 

Floating-Point Status Register Access 

The following instructions load and store the FSR as a 32- 
bit integer. 

Format  Op    Instruction          Description 
9       001   LFSR    gen1         Load FSR 
9       110   SFSR    gen2         Store FSR 
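
The tables above can be collected into a small lookup, which may help when checking an assembler or disassembler against this section. The sketch below is illustrative only: the names `INSTRUCTIONS`, `FIXED` and `instruction_fields` are hypothetical, and the code returns the field values listed in the tables without placing them into an instruction word, since the bit positions are defined by Figure 2-6.

```python
# Field values transcribed from the tables in this section.
I_FIELD = {"B": "00", "W": "01", "D": "11"}   # integer suffix -> i field
F_BIT = {"F": 1, "L": 0}                      # floating suffix -> f bit

# Instruction form -> (format number, op field pattern).
INSTRUCTIONS = {
    "MOVf": (11, "0001"), "MOVLF": (9, "010"), "MOVFL": (9, "011"),
    "MOVif": (9, "000"), "ROUNDfi": (9, "100"), "TRUNCfi": (9, "101"),
    "FLOORfi": (9, "111"), "ADDf": (11, "0000"), "SUBf": (11, "0100"),
    "MULf": (11, "1100"), "DIVf": (11, "1000"), "NEGf": (11, "0101"),
    "ABSf": (11, "1101"), "CMPf": (11, "0010"), "LFSR": (9, "001"),
    "SFSR": (9, "110"),
}

# MOVLF and MOVFL use fixed f and i settings (see the Note above).
FIXED = {"MOVLF": {"f": 1, "i": "10"}, "MOVFL": {"f": 0, "i": "11"}}


def instruction_fields(form, i_suffix=None, f_suffix=None):
    """Return the format, op, i and f field values for one instruction form."""
    fmt, op = INSTRUCTIONS[form]
    fields = {"format": fmt, "op": op}
    if form in FIXED:
        fields.update(FIXED[form])
        return fields
    if i_suffix is not None:
        fields["i"] = I_FIELD[i_suffix]
    if f_suffix is not None:
        fields["f"] = F_BIT[f_suffix]
    return fields
```

For example, `instruction_fields("MOVif", i_suffix="W", f_suffix="F")` describes MOVWF: Format 9, op 000, i field 01, f bit 1.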

2.3 TRAPS 

Upon detecting an exceptional condition in executing a 
floating-point instruction, the NS32081 FPU requests a trap 
by setting the Q bit of the status word transferred during the 
slave protocol (Section 3.5). The CPU responds by perform- 
ing a trap using a default vector value of 3. See the Series 
32000 Instruction Set Reference Manual and the applicable 
CPU data sheet for trap service details. 

A trapped floating-point instruction returns no result, and 
does not affect the CPU Processor Status Register (PSR). 
The FPU displays the reason for the trap in the Trap Type 
(TT) field of the FSR (Section 2.1.2.2). 

3.0 Functional Description 

3.1 POWER AND GROUNDING 

The NS32081 requires a single 5V power supply, applied on 
pin 24 (VCC). See the DC Electrical Characteristics table. 
Grounding connections are made on two pins. Logic Ground 
(GNDL, pin 12) is the common pin for on-chip logic, and 
Buffer Ground (GNDB, pin 13) is the common pin for the 
output drivers. For optimal noise immunity, it is recommended 
that GNDL be attached through a single conductor directly 
to GNDB, and that all other grounding connections be 
made only to GNDB, as shown below (Figure 3-1). 



FIGURE 3-1. Recommended Supply Connections 


3.2 CLOCKING 

The NS32081 FPU requires a single-phase TTL clock input 
on its CLK pin (pin 14). When the FPU is connected to a 
Series 32000 CPU, the CLK signal is provided from the 
CTTL pin of the NS32201 Timing Control Unit. 

3.3 RESETTING 

The RST pin serves as a reset for on-chip logic. The FPU 
may be reset at any time by pulling the RST pin low for at 
least 64 clock cycles. Upon detecting a reset, the FPU 
terminates instruction processing, resets its internal logic, 
and clears the FSR to all zeroes. 

On application of power, RST must be held low for at least 
50 µs after VCC is stable. This ensures that all on-chip 
voltages are completely stable before operation. See 
Figures 3-2 and 3-3. 
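
As a worked example of the two reset requirements, the sketch below converts the 64-clock minimum into time for a given clock period and compares it with the 50 µs power-on hold. The function name is hypothetical, and taking the larger of the two limits at power-on is an assumption made for illustration, not a statement from this data sheet.

```python
RESET_CLOCKS = 64          # minimum RST low time, in clock cycles
POWER_ON_HOLD_US = 50.0    # minimum RST low time after VCC is stable, in microseconds


def min_reset_low_us(clock_period_ns, power_on=False):
    """Return the minimum RST low time in microseconds.

    Assumption: at power-on both limits apply, so the larger one governs.
    """
    pulse_us = RESET_CLOCKS * clock_period_ns / 1000.0
    if power_on:
        return max(pulse_us, POWER_ON_HOLD_US)
    return pulse_us
```

With a 100 ns clock period, 64 clock cycles last 6.4 µs, so the 50 µs figure dominates at power-on.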
FIGURE 3-2. Power-On Reset Requirements 

FIGURE 3-3. General Reset Timing 


3.4 BUS OPERATION 


Instructions and operands are passed to the NS32081 FPU 
with slave processor bus cycles. Each bus cycle transfers 
either one byte (8 bits) or one word (16 bits) to or from the 
FPU. During all bus cycles, the SPC line is driven by the 
CPU as an active low data strobe, and the FPU monitors 
pins ST0 and ST1 to keep track of the sequence (protocol) 
established for the instruction being executed. This is 
necessary in a virtual memory environment, allowing the FPU 
to retry an aborted instruction. 

TL/EE/5234-2 

FIGURE 3-4. System Connection Diagram 

3.4.1 Bus Cycles 

A bus cycle is initiated by the CPU, which asserts the proper 
status on ST0 and ST1 and pulses SPC low. ST0 and ST1 
are sampled by the FPU on the leading (falling) edge of the 
SPC pulse. If the transfer is from the FPU (a slave processor 
read cycle), the FPU asserts data on the data bus for the 
duration of the SPC pulse. If the transfer is to the FPU (a 
slave processor write cycle), the FPU latches data from the 
data bus on the trailing (rising) edge of the SPC pulse. 
Figures 3-5 and 3-6 illustrate these sequences. 

The direction of the transfer and the role of the bidirectional 
SPC line are determined by the instruction protocol being 
performed. SPC is always driven by the CPU during slave 
processor bus cycles. Protocol sequences for each 
instruction are given in Section 3.5. 

3.4.2 Operand Transfer Sequences 

An operand is transferred in one or more bus cycles. A 
1-byte operand is transferred on the least significant byte of 
the data bus (D0-D7). A 2-byte operand is transferred on 
the entire bus. A 4-byte or 8-byte operand is transferred in 
consecutive bus cycles, least significant word first. 

3.5 INSTRUCTION PROTOCOLS 

3.5.1 General Protocol Sequence 

Slave Processor instructions have a three-byte Basic In- 
struction field, consisting of an ID byte followed by an Oper- 
ation Word. See Section 2.2.3 for FPU instruction encod- 
ings. The ID Byte has three functions: 

1) It identifies the instruction to the CPU as being a Slave 
Processor instruction. 

2) It specifies which Slave Processor will execute it. 

3) It determines the format of the following Operation Word 
of the instruction. 

Upon receiving a Slave Processor instruction, the CPU 
initiates the sequence outlined in Table 3-1. While applying 
Status Code 11 (Broadcast ID, Table 3-1), the CPU transfers 
the ID Byte on the least significant half of the Data Bus 
(D0-D7). All Slave Processors input this byte and decode it. 
The Slave Processor selected by the ID Byte is activated, 
and from this point the CPU is communicating only with it. If 
any other slave protocol was in progress (e.g., an aborted 
Slave instruction), this transfer cancels it. 

The CPU next sends the Operation Word while applying 
Status Code 01 (Transfer Slave Operand, Table 3-1). Upon 
receiving it, the FPU decodes it, and at this point both the 
CPU and the FPU are aware of the number of operands to 
be transferred and their sizes. The Operation Word is 
swapped on the Data Bus; that is, bits 0-7 appear on pins 
D8-D15, and bits 8-15 appear on pins D0-D7. 
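
The byte swap of the Operation Word and the operand ordering rule of Section 3.4.2 can be illustrated with a short sketch. The function names are hypothetical, and the code only models the values that appear on the 16-bit data bus; it does not model SPC or status timing.

```python
def swap_operation_word(word):
    """Return the value seen on pins D15-D0 when an Operation Word is sent.

    Bits 0-7 of the word appear on D8-D15 and bits 8-15 appear on D0-D7.
    """
    word &= 0xFFFF
    return ((word & 0x00FF) << 8) | (word >> 8)


def operand_bus_words(value, size_bytes):
    """Split an operand into the values transferred in successive bus cycles.

    A 1-byte operand uses D0-D7 only; larger operands are sent as 16-bit
    words, least significant word first.
    """
    if size_bytes == 1:
        return [value & 0xFF]
    return [(value >> (16 * n)) & 0xFFFF for n in range(size_bytes // 2)]
```

A 4-byte operand 0x11223344 is therefore transferred as 0x3344 followed by 0x1122.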



Note 1: FPU samples CPU status here. 

FIGURE 3-5. Slave Processor Read Cycle 
Note 1: FPU samples CPU status here. 

Note 2: FPU samples data bus here. 

FIGURE 3-6. Slave Processor Write Cycle 

Using the Addressing Mode fields within the Operation 
Word, the CPU starts fetching operands and issuing them to 
the FPU. To do so, it references any Addressing Mode ex- 
tensions appended to the FPU instruction. Since the CPU is 
solely responsible for memory accesses, these extensions 
are not sent to the Slave Processor. The Status Code ap- 
plied is 01 (Transfer Slave Processor Operand, Table 3-1). 
After the CPU has issued the last operand, the FPU starts 
the actual execution of the instruction. Upon completion, it 
will signal the CPU by pulsing SPC low. To allow for this, the 
CPU releases the SPC signal, causing it to float. SPC must 
be held high by an external pull-up resistor. 

Upon receiving the pulse on SPC, the CPU uses SPC to 
read a Status Word from the FPU, applying Status Code 10. 
This word has the format shown in Figure 3-7. If the Q bit 
(“Quit”, Bit 0) is set, this indicates that an error has been 
detected by the FPU. The CPU will not continue the proto- 
col, but will immediately trap through the Slave vector in the 
Interrupt Table. If the instruction being performed is CMPf 
(Section 2.2.3) and the Q bit is not set, the CPU loads Proc- 
essor Status Register (PSR) bits N, Z and L from the corre- 
sponding bits in the Status Word. The NS32081 FPU always 
sets the L bit to zero. 


 15             8 7             0 
  0 0 0 0 0 0 0 0 N Z 0 0 0 L 0 Q 

N, Z, L: NEW PSR BIT VALUE(S) 

Q: "QUIT": TERMINATE PROTOCOL, TRAP (FPU). 

TL/EE/5234-18 

FIGURE 3-7. FPU Protocol Status Word Format 
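
The Status Word layout can be exercised with a small sketch. The bit positions below (Q = bit 0, L = bit 2, Z = bit 6, N = bit 7) are read from Figure 3-7; the function names are hypothetical, and `cmpf_status_word` simply applies the CMPf rules stated in Section 2.2.3 using Python floats as stand-ins for the operands.

```python
# Bit positions within the Status Word, per Figure 3-7.
Q_BIT, L_BIT, Z_BIT, N_BIT = 0, 2, 6, 7


def decode_status_word(word):
    """Extract the Q, N, Z and L bits from a 16-bit Status Word."""
    return {
        "quit": bool((word >> Q_BIT) & 1),
        "N": (word >> N_BIT) & 1,
        "Z": (word >> Z_BIT) & 1,
        "L": (word >> L_BIT) & 1,
    }


def cmpf_status_word(gen1, gen2):
    """Build the Status Word for an untrapped CMPf of two values.

    Z is set if the operands are equal (positive and negative zero compare
    equal), N is set if gen1 is greater than gen2, and L is always zero.
    """
    n = 1 if gen1 > gen2 else 0
    z = 1 if gen1 == gen2 else 0
    return (n << N_BIT) | (z << Z_BIT)
```

For example, `cmpf_status_word(0.0, -0.0)` has only the Z bit set, matching the rule that the two zeros are considered equal.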

The last step in the protocol is for the CPU to read a result, 
if any, and transfer it to the destination. The Read cycles 
from the FPU are performed by the CPU while applying 
Status Code 01 (Section 4.1.2). 


TABLE 3-1. General Instruction Protocol 


Step    Status    Action 
1       11        CPU sends ID Byte. 
2       01        CPU sends Operation Word. 
3       01        CPU sends required operands. 
4       XX        FPU starts execution. 
5       XX        FPU pulses SPC low. 
6       10        CPU reads Status Word. 
7       01        CPU reads result (if any). 


3.5.2 Floating-Point Protocols 

Table 3-2 gives the protocols followed for each floating- 
point instruction. The instructions are referenced by their 
mnemonics. For the bit encodings of each instruction, see 
Section 2.2.3. 

The Operand Class columns give the Access Classes for 
each general operand, defining how the addressing modes 
are interpreted by the CPU (see Series 32000 Instruction 
Set Reference Manual). 

The Operand Issued columns show the sizes of the oper- 
ands issued to the Floating-Point Unit by the CPU. “D” indi- 
cates a 32-bit Double Word, “i” indicates that the instruction 
specifies an integer size for the operand (B = Byte, W = 
Word, D = Double Word), “f” indicates that the instruction 
specifies a floating-point size for the operand (F = 32-bit 
Standard Floating, L = 64-bit Long Floating). 

The Returned Value Type and Destination column gives the 
size of any returned value and where the CPU places it. The 
PSR Bits Affected column indicates which PSR bits, if any, 
are updated from the Slave Processor Status Word (Figure 
3-7). 

Any operand indicated as being of type “f” will not cause a 
transfer if the Register addressing mode is specified, be- 
cause the Floating-Point Registers are physically on the 
Floating-Point Unit and are therefore available without CPU 
assistance. 
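
To make the transfer rule concrete, the sketch below counts the slave bus cycles needed to issue one operand, given its type code and whether Register addressing mode is used. The names are hypothetical; the operand sizes come from the column definitions above, and the cycle count assumes one 16-bit word per bus cycle as described in Section 3.4.

```python
# Operand sizes in bytes for each type code used in Table 3-2.
SIZES = {"B": 1, "W": 2, "D": 4, "F": 4, "L": 8}


def operand_cycles(type_code, register_mode=False):
    """Return the number of bus cycles needed to transfer one operand.

    Floating-point operands (F, L) in Register addressing mode already
    reside in the FPU, so no transfer takes place. Integer operands are
    always transferred, because CPU registers are not visible to the FPU.
    """
    if type_code in ("F", "L") and register_mode:
        return 0
    return max(1, SIZES[type_code] // 2)
```

For ADDL with gen1 in memory and gen2 in a Floating-Point Register, only gen1 is issued, taking `operand_cycles("L")` = 4 bus cycles.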


TABLE 3-2. Floating Point Instruction Protocols 


Mnemonic  Operand 1  Operand 2  Operand 1  Operand 2  Returned Value   PSR Bits 
          Class      Class      Issued     Issued     Type and Dest.   Affected 

ADDf      read.f     rmw.f      f          f          f to Op. 2       none 
SUBf      read.f     rmw.f      f          f          f to Op. 2       none 
MULf      read.f     rmw.f      f          f          f to Op. 2       none 
DIVf      read.f     rmw.f      f          f          f to Op. 2       none 
MOVf      read.f     write.f    f          N/A        f to Op. 2       none 
ABSf      read.f     write.f    f          N/A        f to Op. 2       none 
NEGf      read.f     write.f    f          N/A        f to Op. 2       none 
CMPf      read.f     read.f     f          f          N/A              N,Z,L 
FLOORfi   read.f     write.i    f          N/A        i to Op. 2       none 
TRUNCfi   read.f     write.i    f          N/A        i to Op. 2       none 
ROUNDfi   read.f     write.i    f          N/A        i to Op. 2       none 
MOVFL     read.F     write.L    F          N/A        L to Op. 2       none 
MOVLF     read.L     write.F    L          N/A        F to Op. 2       none 
MOVif     read.i     write.f    i          N/A        f to Op. 2       none 
LFSR      read.D     N/A        D          N/A        N/A              none 
SFSR      N/A        write.D    N/A        N/A        D to Op. 2       none 


D = Double Word 

i = Integer size (B, W, D) specified in mnemonic, 
f = Floating-Point type (F, L) specified in mnemonic. 
N/A = Not Applicable to this instruction. 
4.0 Device Specifications 

4.1 PIN DESCRIPTIONS 

The following are brief descriptions of all NS32081 FPU 
pins. The descriptions reference the relevant portions of the 
Functional Description, Section 3. 

Dual-In-Line Package 



Top View 

FIGURE 4-1. Connection Diagram 

Order Number NS32081D-10 or NS32081D-15 
See NS Package Number D24C 

Order Number NS32081N-10 or NS32081N-15 
See NS Package Number N24A 


4.1.1 Supplies 

Power (Vcc): +5V positive supply. Section 3.1. 

Logic Ground (GNDL): Ground reference for on-chip logic. 
Section 3.1. 

Buffer Ground (GNDB): Ground reference for on-chip driv- 
ers connected to output pins. Section 3.1. 

4.1.2 Input Signals 

Clock (CLK): TTL-level clock signal. 

Reset (RST): Active low. Initiates a Reset, Section 3.3. 


Status (ST0, ST1): Input from CPU. ST0 is the least significant 
bit. Section 3.4 encodings are: 

00 - (Reserved) 

01 - Transferring Operation Word or Operand 

10 - Reading Status Word 

11 - Broadcasting Slave ID 

4.1.3 Input/Output Signals 

Slave Processor Control (SPC): Active low. Driven by the 
CPU as the data strobe for bus transfers to and from the 
NS32081 FPU, Section 3.4. Driven by the FPU to signal 
completion of an operation, Section 3.5.1. Must be held high 
with an external pull-up resistor while floating. 

Data Bus (D0-D15): 16-bit bus for data transfer. D0 is the 
least significant bit. Section 3.4. 


4.2 ABSOLUTE MAXIMUM RATINGS 

Temperature Under Bias              0°C to +70°C 
Storage Temperature                 -65°C to +150°C 
All Input or Output Voltages 
  with Respect to GND               -0.5V to +7.0V 
Power Dissipation                   1.5W 


If Military/Aerospace specified devices are required, 
please contact the National Semiconductor Sales 
Office/Distributors for availability and specifications. 

Note: Absolute maximum ratings indicate limits beyond 
which permanent damage may occur. Continuous operation 
at these limits is not intended; operation should be limited to 
those conditions specified under Electrical Characteristics. 


4.3 ELECTRICAL CHARACTERISTICS TA = 0°C to 70°C, VCC = 5V ±5%, GND = 0V 

Symbol  Parameter                            Conditions 
VIH     HIGH Level Input Voltage 
VIL     LOW Level Input Voltage 
VOH     HIGH Level Output Voltage            IOH = -400 µA 
VOL     LOW Level Output Voltage             IOL = 4 mA 
        Input Load Current                   0 ≤ VIN ≤ VCC 
        Leakage Current, Output and I/O      0.45 ≤ VIN ≤ 2.4V 
        Pins in TRI-STATE/Input Mode 
        Active Supply Current                IOUT = 0, TA = 25°C 
4.4 SWITCHING CHARACTERISTICS 

4.4.1 Definitions 

All the Timing Specifications given in this section refer to 0.8V 
and 2.0V on all the input and output signals as illustrated in 
Figures 4-2 and 4-3, unless specifically stated otherwise. 

ABBREVIATIONS 

L.E. - Leading Edge 
T.E. - Trailing Edge 
R.E. - Rising Edge 
F.E. - Falling Edge 


FIGURE 4-2. Timing Specification Standard 
(Signal Valid After Clock Edge) 


FIGURE 4-3. Timing Specification Standard 
(Signal Valid Before Clock Edge) 
4.4.2 Timing Tables 

4.4.2.1 Output Signal Propagation Delays 

Maximum times assume capacitive loading of 100 pF. 

                                                                 NS32081-10            NS32081-15 
Name     Figure  Description               Reference/Conditions  Min        Max        Min        Max        Units 
tDv      4-7     Data Valid                After SPC L.E.                   45                    30         ns 
tDf      4-7     D0-D15 Floating           After SPC T.E.                   50         2          35         ns 
tSPCFw   4-9     SPC Pulse Width from FPU  At 0.8V (Both Edges)  tCLKp - 50 tCLKp + 50 tCLKp - 40 tCLKp + 40 ns 
tSPCFl   4-9     SPC Output Active         After CLK R.E.                   55                    38         ns 
tSPCFh   4-9     SPC Output Inactive       After CLK R.E.                   55                    38         ns 
tSPCFnf  4-9     SPC Output Nonforcing     After CLK F.E.                   45                    35         ns 

4.4.2.2 Input Signal Requirements 

                                                                 NS32081-10   NS32081-15 
Name    Figure  Description               Reference/Conditions   Min   Max    Min   Max    Units 
tPWR    4-5     Power Stable to RST R.E.  After VCC Reaches 4.5V 50           50           µs 
tRSTw   4-6     RST Pulse Width           At 0.8V (Both Edges)   64           64           tCLKp 
tSs     4-7     Status (ST0-ST1) Setup    Before SPC L.E.        50           33           ns 
tSh     4-7     Status (ST0-ST1) Hold     After SPC L.E.         40           35           ns 
tDs     4-8     D0-D15 Setup Time         Before SPC T.E.        40           30           ns 
tDh     4-8     D0-D15 Hold Time          After SPC T.E.         50           35           ns 
tSPCw   4-7     SPC Pulse Width from CPU  At 0.8V (Both Edges)   70           50           ns 
tSPCs   4-7     SPC Input Active          Before CLK R.E.        40           35           ns 
tSPCh   4-7     SPC Input Inactive        After CLK R.E.         0            0            ns 
tRSTs   4-10    RST Setup                 Before CLK F.E.        10           10           ns 
tRSTh   4-10    RST R.E. Delay            After CLK R.E.         0            0            ns 

4.4.2.3 Clocking Requirements 

                                                                 NS32081-10   NS32081-15 
Name    Figure  Description      Reference/Conditions            Min   Max    Min   Max    Units 
tCLKh   4-4     Clock High Time  At 2.0V (Both Edges)            42    1000   27    1000   ns 
tCLKl   4-4     Clock Low Time   At 0.8V (Both Edges)            42    1000   27    1000   ns 
tCLKp   4-4     Clock Period     CLK R.E. to Next CLK R.E.       100   2000   66           ns 

4.4.3 Timing Diagrams 



TL/EE/5234-19 

FIGURE 4-4. Clock Timing 



TL/EE/5234-20 

FIGURE 4-5. Power-On Reset 





FIGURE 4-6. Non-Power-On Reset 


TL/EE/5234-21 




Note: SPC pulse may also be 2 clocks wide, but its edges must meet the tSPCs and tSPCh requirements with respect to CLK. 

FIGURE 4-9. SPC Pulse from FPU 


TL/EE/5234-24 


FIGURE 4-10. RST Release Timing 

Note: The rising edge of RST must occur while CLK is high, as shown. 
National 
Semiconductor 

NS32202-10 Interrupt Control Unit 

General Description 

The NS32202 Interrupt Control Unit (ICU) is the interrupt 
controller for the Series 32000® microprocessor family. It is 
a support circuit that minimizes the software and real-time 
overhead required to handle multi-level, prioritized 
interrupts. A single NS32202 manages up to 16 interrupt 
sources, resolves interrupt priorities, and supplies a 
single-byte interrupt vector to the CPU. 

The NS32202 can operate in either of two data bus modes: 
16-bit or 8-bit. In the 16-bit mode, eight hardware and eight 
software interrupt positions are available. In the 8-bit mode, 
16 hardware interrupt positions are available, 8 of which can 
be used as software interrupts. In this mode, up to 16 
additional ICUs may be cascaded to handle a maximum of 256 
interrupts. 

Two 16-bit counters, which may be concatenated under 
program control into a single 32-bit counter, are also 
available for real-time applications. 


Features 

■ 16 maskable interrupt sources, cascadable to 256 

■ Programmable 8- or 16-bit data bus mode 

■ Edge or level triggering for each hardware interrupt with 
individually selectable polarities 

■ 8 software interrupts 

■ Fixed or rotating priority modes 

■ Two 16-bit, DC to 10 MHz counters, that may be con- 
catenated into a single 32-bit counter 

■ Optional 8-bit I/O port available in 8-bit data bus mode 

■ High-speed XMOS™ technology 

■ Single, +5V supply 

■ 40-pin, dual in-line package 



Basic System Configuration 

TL/EE/5117-1 
1.0 Product Introduction 

The NS32202 ICU functions as an overall manager in an 
interrupt-oriented system environment. Its many features 
and options permit the design of sophisticated interrupt sys- 
tems. 

Figure 1-1 shows the internal organization of the NS32202. 
As shown, the NS32202 is divided into five functional 
blocks. These are described in the following paragraphs: 

1.1 I/O BUFFERS AND LATCHES 

The I/O Buffers and Latches block is the interface with the 
system data bus. It contains bidirectional buffers for the 
data I/O pins. It also contains registers and logic circuits 
that control the operation of pins G0/IR0 through G7/IR14 
when the ICU is in the 8-bit bus mode. 

1.2 READ/WRITE LOGIC AND DECODERS 

The Read/Write Logic and Decoders manage all internal 
and external data transfers for the ICU. These include Data, 
Control, and Status Transfers. This circuit accepts inputs 
from the CPU address and control buses. In turn, it issues 
commands to access the internal registers of the ICU. 

1.3 TIMING AND CONTROL 

The Timing and Control Block contains status elements that 
select the ICU operating mode. It also contains state ma- 
chines that generate all the necessary sequencing and con- 
trol signals. 


1.4 PRIORITY CONTROL 

The Priority Control Block contains 16 units, one for each 
interrupt position. These units provide the following func- 
tions. 

• Sensing the various forms of hardware interrupt sig- 
nals e.g. level (high/low) or edge (rising/falling) 

• Resolving priorities and generating an interrupt re- 
quest to the CPU 

• Handling cascaded arrangements 

• Enabling software interrupts 

• Providing for an automatic return from interrupt 

• Enabling the assignment of any interrupt position to 
the internal counters 

• Providing for rearrangement of priorities by assigning 
the first priority to any interrupt position 

• Enabling automatic rotation of priorities 

1.5 COUNTERS 

This block contains two 16-bit counters, called the H-coun- 
ter and the L-counter. These are down counters that count 
from an initial value to zero. Both counters have a 16-bit 
register (designated HCSV and LCSV) for loading their re- 
starting values. They also have registers containing the cur- 
rent count values (HCCV and LCCV). Both sets of registers 
are fully described in Section 3. 




FIGURE 1-1. NS32202 ICU Block Diagram 
The counters are under program control and can be used to 
generate interrupts. When the count reaches zero, either 
counter can generate an interrupt request to any of the 16 
interrupt positions. The counter then reloads the start value 
from the appropriate registers and resumes counting. Figure 
1-2 shows typical counter output signals available from the 
NS32202. 

The maximum input clock frequency is 2.5 MHz. 

A divide-by-four prescaler is also provided. When the pre- 
scaler is used, the input clock frequency can be up to 10 
MHz. 

When intervals longer than provided by a 16-bit counter are 
needed, the L- and H-counters can be concatenated to form 
a 32-bit counter. In this case, both counters are controlled 
by the H-counter control bits. Refer to the discussion of the 
Counter Control Register in Section 3 for additional informa- 
tion. Figure 1-3 summarizes counter read/write operations. 
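
As a rough aid for choosing a starting value, the sketch below estimates the interval between successive zero counts. It is an illustration under stated assumptions, not a specification: the function name is hypothetical, and the period is taken as (starting value + 1) counting clocks on the assumption that the reload occurs on the clock cycle after the count reaches zero, as noted in Figure 1-3.

```python
def counter_period_us(start_value, clock_hz, prescale=False, bits=16):
    """Estimate the time between zero counts, in microseconds.

    bits is 16 for a single counter or 32 for the concatenated L- and
    H-counters. With prescale=True the input clock is divided by four.
    """
    if not 0 <= start_value < (1 << bits):
        raise ValueError("start value does not fit in the counter")
    limit_hz = 10_000_000 if prescale else 2_500_000
    if clock_hz > limit_hz:
        raise ValueError("input clock exceeds the stated maximum")
    count_hz = clock_hz / 4 if prescale else clock_hz
    return (start_value + 1) * 1e6 / count_hz
```

Under these assumptions, a starting value of 2499 with a 2.5 MHz clock gives an interval of about 1 ms, and the same interval results from a 10 MHz clock with the prescaler enabled.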


2.0 Functional Description 

2.1 RESET 

The ICU is reset when a logic low signal is present on the 
RST pin. At reset, most internal ICU registers are affected, 
and the ICU becomes inactive. 

2.2 INITIALIZATION 

After reset, the CPU must initialize the NS32202 to establish 
its configuration. Proper initialization requires knowledge of 
the ICU register formats. Therefore, a flowchart of a 
recommended initialization sequence is shown in Figure 3-3, 
after the discussion of the ICU registers. 

The operation sequence shown in Figure 3-3 ensures that 
all counter output pins remain inactive until the counters are 
completely initialized. 

2.3 VECTORED INTERRUPT HANDLING 

For details on the operation of the vectored interrupt mode 
for a particular Series 32000 CPU, refer to the data sheet for 
that CPU. In this discussion, it is assumed that the NS32202 
is working with a CPU in the vectored interrupt mode. Several 
ICU applications are discussed, including non-cascaded 
and cascaded operation. Figures 2-1, 2-2, and 2-3 show 
typical configurations of the ICU used with the NS32016 
CPU. 

A peripheral device issues an interrupt request by sending 
the proper signal to one of the NS32202 interrupt inputs. If 
the interrupt input is not masked, the ICU activates its 
Interrupt Output (INT) pin and generates an interrupt vector 
byte. The interrupt vector byte identifies the interrupt source 
in its four least significant bits. When the CPU detects a low 
level on its Interrupt Input pin, it performs one or two interrupt 
acknowledge cycles depending on whether the interrupt 
request is from the master ICU or a cascaded ICU. Figure 2-4 
shows a flowchart of a typical CPU Interrupt Acknowledge 
sequence. 

FIGURE 1-2. Counter Output Signals in Pulsed Form and Square Waveform for Three Different Initial Values 
BASIC OPERATIONS: 

Writing to LCSV/HCSV 

Reading LCSV/HCSV 

Writing to LCCV/HCCV 
(only possible when counters are halted) 

Reading LCCV/HCCV 
(only possible when counter readings are frozen) 

Counter counts and readings are not frozen 

Counter reloads starting value 
(occurs on the clock cycle following 
the one in which it reaches zero) 

FIGURE 1-3. Counter Configuration and Basic Operations 
FIGURE 2-1. Interrupt Control Unit Connections in 16-Bit Bus Mode 

NOTE: In the 8-Bit Bus Mode the Master ICU Registers appear at even 
addresses (A0 = 0) since the ICU communicates with the least 
significant byte of the CPU data bus. 

FIGURE 2-2. Interrupt Control Unit Connections in 8-Bit Bus Mode 

Cond. A is true if current instruction is terminated 
or an interruptible point in a string instruction is 
reached. 

FIGURE 2-4. CPU Interrupt Acknowledge Sequence 

FIGURE 2-6. CPU Return from Interrupt Sequence 

The master ICU maintains a list (in the CSRC register pair) 
of its interrupt positions that are cascaded. It also provides a 
4-bit (hidden) counter (in-service counter) for each interrupt 
position to keep track of the number of interrupts being 
serviced in the cascaded ICUs. When a cascaded interrupt 
input is active, the master ICU activates its interrupt output 
and the CPU responds with a Master Interrupt Acknowledge 
Cycle. However, instead of generating a positive interrupt 
vector, the master ICU generates a negative Cascade Table 
index. 

The CPU interprets the negative number returned from the 
master ICU as an index into the Cascade Table. The Cas- 
cade Table is located in a negative direction from the Dis- 
patch Table, and it contains the virtual addresses of the 
hardware vector registers for any cascaded NS32202s in 
the system. Thus, the Cascade Table index supplied by the 
master ICU identifies the cascaded ICU that requested the 
interrupt. 

Once the cascaded ICU is identified, the CPU performs a 
Cascaded Interrupt Acknowledge cycle. During this cycle, 
the CPU reads the final vector value directly from the cas- 
caded ICU, and uses it to access the Dispatch Table. Each 


cascaded ICU, of course, has its own set of 16 unique interrupt 
vectors, one vector for each of its 16 interrupt positions. 
The CPU interprets the vector value read during a Cascad- 
ed Interrupt Acknowledge cycle as an unsigned number. 
Thus, this vector can be in the range 0 through 255. 

When a cascaded interrupt service routine completes its 
task, it must return control to the interrupted program with 
the same RETI instruction used in non-cascaded interrupt 
service routines. However, when the CPU performs a Mas- 
ter Return From Interrupt cycle, the CPU accesses the mas- 
ter ICU and reads the negative Cascade Table index identi- 
fying the cascaded ICU that originally received the interrupt 
request. Using the cascaded ICU address, the CPU now 
performs a Cascaded Return From Interrupt cycle, informing 
the cascaded ICU that the service routine is over. The byte 
provided by the cascaded ICU during this cycle is ignored. 
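
The difference between the two acknowledge cycles can be sketched as follows. This is an illustration only: the function names are hypothetical, and treating the byte returned by the master ICU as an 8-bit two's complement number is an assumption consistent with the "negative Cascade Table index" described above; consult the CPU data sheet for the exact encoding.

```python
def interpret_master_vector(byte):
    """Classify the byte read during a Master Interrupt Acknowledge cycle.

    Assumption: the byte is a signed 8-bit value. A negative value is a
    Cascade Table index; a non-negative value indexes the Dispatch Table.
    """
    byte &= 0xFF
    signed = byte - 0x100 if byte & 0x80 else byte
    if signed < 0:
        return ("cascade", signed)
    return ("dispatch", signed)


def interpret_cascaded_vector(byte):
    """Classify the byte read during a Cascaded Interrupt Acknowledge cycle.

    The value is unsigned, so it lies in the range 0 through 255.
    """
    return ("dispatch", byte & 0xFF)
```

The same bit pattern, for example 0xF3, is a Cascade Table index of -13 when read from the master ICU but Dispatch Table vector 243 when read from a cascaded ICU.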

2.4 INTERNAL ICU OPERATING SEQUENCE 

The NS32202 ICU accepts two interrupt types, software and 
hardware. 

Software interrupts are initiated when the CPU sets the 
proper bit in the Interrupt Pending (IPND) registers (R6, R7), 
located in the ICU. Bits are set and reset by writing the 
proper byte to either R6 or R7. Software interrupts can be 
masked by setting the proper bit in the mask registers (R10, 
R11). 

Hardware interrupts can be either internal or external to the 
ICU. Internal ICU hardware interrupts are initiated by the on- 
chip counter outputs. External hardware interrupts are initia- 
ted by devices external to the ICU, that are connected to 
any of the ICU interrupt input pins. 

Hardware interrupts can be masked by setting the proper bit 
in the mask registers (R10, R11). If the Freeze bit (FRZ), 
located in the Mode Control Register (MCTL), is set, all in- 
coming hardware interrupts are inhibited from setting their 
corresponding bits in the IPND registers. This prevents the 
ICU from recognizing any hardware interrupts. 

Once the ICU is initialized, it is enabled to accept interrupts. 
If an active interrupt is not masked, and has a higher priority 
than any interrupt currently being serviced, the ICU acti- 
vates its Interrupt Output (INT). Figure 2-7 is a flowchart 
showing the ICU interrupt acknowledge sequence. 

The CPU responds to the active INT line by performing an 
Interrupt Acknowledge bus cycle. During this cycle, the ICU 
clears the IPND bit corresponding to the active interrupt po- 
sition and sets the corresponding bit in the Interrupt In-Serv- 
ice Registers (ISRV). The 4-bit in-service counter in the 
master ICU is also incremented by one if the fixed priority 
mode is selected and the interrupt is from a cascaded ICU. 
The ISRV bit remains set until the CPU performs a RETI bus 
cycle and the 4-bit in-service counter is decremented to 
zero. Figure 2-8 is a flowchart showing ICU operation dur- 
ing a RETI bus cycle. 

When the ISRV bit is set, the INT output is disabled. This 
output remains inactive until a higher priority interrupt posi- 
tion becomes active, or the ISRV bit is cleared. 

An exception to the above occurs in the master ICU when 
the fixed priority mode is selected, and the interrupt input is 
connected to the INT output of a cascaded ICU. In this case 
the ISRV bit does not inhibit an interrupt of the same priority. 
This is to allow nesting of interrupts in a cascaded ICU. 
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2.0 Functional Description (Continued) 
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* Cond. B is true if any one of the following condi- 
tions is satisfied. 

1) No interrupt is being serviced 

2) There is a pending unmasked interrupt with 
priority higher than that of the interrupt being 
serviced. 

3) There is a pending unmasked interrupt from a 
cascaded ICU with priority higher or same as that 
of the highest priority interrupt position in the 

FIGURE 2-7. ICU Interrupt Acknowledge Sequence 

2.5 INTERRUPT PRIORITY MODES 

The NS32202 ICU can operate in one of four interrupt priori- 
ty modes: Fixed Priority; Auto-Rotate; Special Mask; and 
Polling. Each mode is described below. 

2.5.1 Fixed Priority Mode 

In the Fixed Priority Mode (also called Fully Nested Mode), 
each interrupt position is ranked in priority from 0 to 15, with 
0 being the highest priority. In this mode, the processing of 
lower priority interrupts is nested with higher priority inter- 
rupts. Thus, while an interrupt is being serviced, any other 
interrupts of the same or lower priority are inhibited. The ICU 
does, however, recognize higher priority interrupt requests. 
When the interrupt service routine executes its RETI instruc- 
tion, the corresponding ISRV bit is cleared. This allows any 
lower priority interrupt request to be serviced by the CPU. 
At reset, the default priority assignment gives interrupt IR0 
priority 0 (highest priority), interrupt IR1 priority 1, and so 
forth. Interrupt IR15 is, of course, assigned priority 15, the 
lowest priority. The default priority assignment can be al- 
tered by writing an appropriate value into register FPRT (L) 
as explained in Section 3.9. 
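
The nesting rule described above can be sketched as a small software model. This is only an illustration under stated assumptions, not the device logic: it assumes the default priority assignment (IR0 highest), ignores cascaded ICUs, and uses hypothetical function names that do not come from the data sheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the lowest-numbered set bit (0..15), or -1 if no bit is set.
 * With the default assignment, a lower position number means higher priority. */
static inline int icu_highest_priority(uint16_t bits)
{
    for (int pos = 0; pos < 16; pos++) {
        if (bits & (uint16_t)(1u << pos)) {
            return pos;
        }
    }
    return -1;
}

/* Simplified model (hypothetical helper): would INT be asserted?
 * ipnd, imsk and isrv mirror the 16-bit IPND, IMSK and ISRV register pairs. */
static inline bool icu_model_int_asserted(uint16_t ipnd, uint16_t imsk, uint16_t isrv)
{
    int pending    = icu_highest_priority((uint16_t)(ipnd & (uint16_t)~imsk));
    int in_service = icu_highest_priority(isrv);

    if (pending < 0) {
        return false;            /* nothing pending and unmasked */
    }
    if (in_service < 0) {
        return true;             /* no interrupt is being serviced */
    }
    return pending < in_service; /* same or lower priority is inhibited */
}
```

For example, with position 3 in service, a request at position 5 is held off while a request at position 1 is passed to the CPU.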

Note: When the ICU generates an interrupt request to the CPU for a higher 
priority interrupt while a lower priority interrupt is still being serviced by 
the CPU, the CPU responds to the interrupt request only if its internal 
interrupt enable flag is set. Normally, this flag is reset at the beginning 
of an interrupt acknowledge cycle and set during the RETI cycle. If the 
CPU is to respond to higher priority interrupts during any interrupt 
service routine, the service routine must set the internal CPU interrupt 
enable flag, as soon during the service routine as desired. 

2.5.2 Auto-Rotate Mode 

The Auto Rotate Mode is selected when the NTAR bit is set 
to 0, and is automatically entered after Reset. In this mode 
an interrupt source position is automatically assigned lowest 
priority after a request at that position has been serviced. 
Highest priority then passes to the next lower priority posi- 
tion. For example, when servicing of the interrupt request at 
position 3 is completed (ISRV bit 3 is cleared), interrupt po- 
sition 3 is assigned lowest priority and position 4 assumes 
highest priority. The nesting of interrupts is inhibited, since 
the interrupt being serviced always has the highest priority. 
This mode is used when the interrupting devices have to be 
assigned equal priority. A device requesting an interrupt will 
have to wait, in the worst case, until each of the 15 other 
devices has been serviced at most once. 
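
The rotation in the example above (position 3 serviced, position 4 becomes first) can be expressed with modulo-16 arithmetic. The sketch below is an illustration only; the helper names are hypothetical, and the wrap from position 15 back to position 0 is an assumption based on the circular ordering the text implies.

```c
#include <stdint.h>

/* Position that takes first (highest) priority after `serviced` completes. */
static inline uint8_t icu_rotate_first_priority(uint8_t serviced)
{
    return (uint8_t)((serviced + 1u) & 0x0Fu);
}

/* Rank of `pos` relative to `first`: 0 = highest priority, 15 = lowest. */
static inline uint8_t icu_rank(uint8_t pos, uint8_t first)
{
    return (uint8_t)((pos - first) & 0x0Fu);
}
```

After position 3 is serviced, position 4 has rank 0 and position 3 has rank 15, so a device at position 3 waits behind at most 15 others.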

2.5.3 Special Mask Mode 

The Special Mask Mode is used when it is necessary to 
dynamically alter the ICU priority structure while an interrupt 
is being serviced. For example, it may be desired in a partic- 
ular interrupt service routine to enable lower priority inter- 
rupts during a part of the routine. To do so, the ICU must be 
programmed in fixed priority mode and the interrupt service 
routine must control its own in-service bit in the ISRV regis- 
ters. 


The bits of the ISRV registers are changed with either the 
Set Bit Interlocked or Clear Bit Interlocked instructions (SBI- 
TIW or CBITIW). The in-service bit is cleared to enable low- 
er priority interrupts and set to disable them. 

Note: For proper operation of the ICU, an interrupt service routine must set 
its ISRV bit before executing the RETI instruction. This prevents the 
RETI cycle from clearing the wrong ISRV bit. 

2.5.4 Polling Mode 

The Polling Mode gives complete control of interrupt priority 
to the system software. Either some or all of the interrupt 
positions can be assigned to the polling mode. To assign all 
interrupt positions to the polling mode, the CPU interrupt 
enable flag is reset. To assign only some of the interrupt 
positions to the polling mode, the desired interrupt positions 
are masked in the Interrupt Mask registers (IMSK). In either 
case, the polling operation consists of reading the Interrupt 
Pending (IPND) registers. 

If necessary, the IPND read can be synchronized by setting 
the Freeze (FRZ) bit in the Mode Control register (MCTL). 
This prevents any change in the IPND registers during the 
read. The FRZ bit must be reset after the polling operation 
so the IPND contents can be updated. If an edge-triggered 
interrupt occurs while the IPND registers are frozen, the in- 
terrupt request is latched, and transferred to the IPND regis- 
ters as soon as FRZ is reset. 
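
The freeze, read, unfreeze sequence can be sketched as follows. This is a minimal illustration under several assumptions: the ICU is memory-mapped in 8-bit bus mode with one byte per register at an offset equal to the register number, and the caller supplies the mask of the FRZ bit, because the MCTL bit positions are not reproduced in this text. The function name is hypothetical.

```c
#include <stdint.h>

enum { ICU_IPND_L = 6, ICU_IPND_H = 7, ICU_MCTL = 16 };

/* Reads a consistent 16-bit IPND snapshot.
 * regs:     base of the ICU register block (assumed layout, see above)
 * frz_mask: mask of the FRZ bit in MCTL, supplied by the caller */
static inline uint16_t icu_poll_ipnd(volatile uint8_t *regs, uint8_t frz_mask)
{
    uint16_t pending;

    regs[ICU_MCTL] = (uint8_t)(regs[ICU_MCTL] | frz_mask);            /* freeze IPND */
    pending = (uint16_t)(regs[ICU_IPND_L] | (regs[ICU_IPND_H] << 8)); /* read both halves */
    regs[ICU_MCTL] = (uint8_t)(regs[ICU_MCTL] & (uint8_t)~frz_mask);  /* release IPND */
    return pending;
}
```

Restoring MCTL immediately after the read keeps the freeze window short, so latched edge-triggered requests reach IPND with little delay.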

The polling mode is useful when a single routine is used to 
service several interrupt levels. 

3.0 Architectural Description 

The NS32202 has thirty-two 8-bit registers that can be ac- 
cessed either individually or in pairs. In 16-bit data bus 
mode, register pairs can be accessed with the CPU word or 
double-word reference instructions. Figure 3-1 shows the 
ICU internal registers. This figure summarizes the name, 
function, and offset address for each register. 

Because some registers hold similar data, they are grouped 
into functional pairs and assigned a single name. However, 
if a single register in a pair is referenced, either an L or an H 
is appended to the register name. The letters are placed in 
parentheses and stand for the low order 8 bits (L) and the 
high order 8 bits (H). For example, register R6, part of the 
Interrupt Pending (IPND) register pair, is referred to individu- 
ally as IPND(L). 
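
The naming convention can be captured in a C enumeration. The register numbers below are taken from the section headings of this chapter; the identifier names are hypothetical, and the rule that the (L) register is the lower-numbered one of each pair is stated in the text only for IPND and is assumed for the other pairs.

```c
/* Register numbers R0..R31 of the NS32202 (identifier names are illustrative). */
enum icu_reg {
    ICU_HVCT   = 0,
    ICU_SVCT   = 1,
    ICU_ELTG_L = 2,  ICU_ELTG_H = 3,
    ICU_TPL_L  = 4,  ICU_TPL_H  = 5,
    ICU_IPND_L = 6,  ICU_IPND_H = 7,
    ICU_ISRV_L = 8,  ICU_ISRV_H = 9,
    ICU_IMSK_L = 10, ICU_IMSK_H = 11,
    ICU_CSRC_L = 12, ICU_CSRC_H = 13,
    ICU_FPRT_L = 14, ICU_FPRT_H = 15,
    ICU_MCTL   = 16,
    ICU_OCASN  = 17,
    ICU_CIPTR  = 18,
    ICU_PDAT   = 19,
    ICU_IPS    = 20,
    ICU_PDIR   = 21,
    ICU_CCTL   = 22,
    ICU_CICTL  = 23,
    ICU_LCSV_L = 24, ICU_LCSV_H = 25,
    ICU_HCSV_L = 26, ICU_HCSV_H = 27,
    ICU_LCCV_L = 28, ICU_LCCV_H = 29,
    ICU_HCCV_L = 30, ICU_HCCV_H = 31
};
```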

The following paragraphs give detailed descriptions of the 
registers shown in Figure 3-1. 

3.1 HVCT — HARDWARE VECTOR REGISTER (R0) 

The HVCT register is a single register that contains the in- 
terrupt vector byte supplied to the CPU during an Interrupt 
Acknowledge (INTA) or Return From Interrupt (RETI) cycle. 
The HVCT bit map is shown below: 

   7   6   5   4   3   2   1   0 
   B   B   B   B   V   V   V   V 

REG. NUMBER AND        REG.     REG. FUNCTION 
ADDRESS IN HEX         NAME 

R0 (00)                HVCT     HARDWARE VECTOR 
R1 (01)                SVCT     SOFTWARE VECTOR 
R2, R3 (02, 03)        ELTG     EDGE/LEVEL TRIGGERING 
R4, R5 (04, 05)        TPL      TRIGGERING POLARITY 
R6, R7 (06, 07)        IPND     INTERRUPTS PENDING 
R8, R9 (08, 09)        ISRV     INTERRUPTS IN-SERVICE 
R10, R11 (0A, 0B)      IMSK     INTERRUPT MASK 
R12, R13 (0C, 0D)      CSRC     CASCADED SOURCE 
R14, R15 (0E, 0F)      FPRT     FIRST PRIORITY 
R16 (10)               MCTL     MODE CONTROL 
R17 (11)               OCASN    OUTPUT CLOCK ASSIGNMENT 
R18 (12)               CIPTR    COUNTER INTERRUPT POINTER 
R19 (13)               PDAT     PORT DATA 
R20 (14)               IPS      INTERRUPT/PORT SELECT 
R21 (15)               PDIR     PORT DIRECTION 
R22 (16)               CCTL     COUNTER CONTROL 
R23 (17)               CICTL    COUNTER INTERRUPT CONTROL 
R24, R25 (18, 19)      LCSV     L-COUNTER STARTING VALUE 
R26, R27 (1A, 1B)      HCSV     H-COUNTER STARTING VALUE 
R28, R29 (1C, 1D)      LCCV     L-COUNTER CURRENT VALUE 
R30, R31 (1E, 1F)      HCCV     H-COUNTER CURRENT VALUE 

FIGURE 3-1. ICU Internal Registers 

The BBBB field is the bias, which is programmed by writing 
BBBB0000 (binary) to the SVCT register (R1). The VVVV field 
identifies one of the 16 interrupt positions. The contents of the 
HVCT register provide various information to the CPU, as 
shown in Figure 3-2. 
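
Splitting a vector byte into its bias and position fields is a matter of shifting and masking. The sketch below assumes the bias occupies the four most significant bits and the position the four least significant bits, as the description of SVCT writes implies; the type and function names are hypothetical. The byte would normally be obtained from SVCT or captured during an acknowledge cycle, not by reading HVCT from a program.

```c
#include <stdint.h>

struct icu_vector {
    uint8_t bias;      /* BBBB field, bits 7..4 */
    uint8_t position;  /* VVVV field, bits 3..0 */
};

/* Splits an 8-bit vector byte into its two 4-bit fields. */
static inline struct icu_vector icu_decode_vector(uint8_t byte)
{
    struct icu_vector v;
    v.bias     = (uint8_t)(byte >> 4);
    v.position = (uint8_t)(byte & 0x0Fu);
    return v;
}
```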

Note 1: The ICU always interprets a read of the HVCT register as either an 
INTA or RETI cycle. Since these cycles cause internal changes to 
the ICU, normal programs must never read the ICU HVCT register. 

Note 2: If the HVCT register is read with ST1 = 0 (INTA cycle) and no 
unmasked interrupt is pending, the binary value BBBB1111 is 
returned and any pending edge-triggered interrupt in position 15 is 
cleared. 

If the auto-rotate priority mode is selected, the FPRT register is also 
cleared, thus preventing any interrupt from being acknowledged. In 
this case a re-initialization of the FPRT register is required for the 
ICU to acknowledge interrupts again. 

If a read of the HVCT register is performed with ST1 = 1 (RETI 
cycle), the binary value BBBB1111 is returned. 

If the auto-rotate mode is selected, a priority rotation is also 
performed. 

3.2 SVCT — SOFTWARE VECTOR REGISTER (R1) 

The SVCT register is a copy of the HVCT register. It allows 
the programmer to read the contents of the HVCT register 
without initiating an INTA or RETI cycle in the ICU. It also 
allows a programmer to change the BBBB field of the HVCT 
register. The bit map of the SVCT register is the same as for 
the HVCT register. 

During a write to SVCT, the four least significant bits are 
unaffected while the four most significant bits are written 
into both SVCT and HVCT (R1 and R0). 
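
The effect of a write to SVCT can be modeled in a few lines. This is a software illustration of the rule stated above, with hypothetical function names; it does not access the device.

```c
#include <stdint.h>

/* Byte to write to SVCT in order to program a 4-bit bias (BBBB0000). */
static inline uint8_t icu_bias_byte(uint8_t bias)
{
    return (uint8_t)((bias & 0x0Fu) << 4);
}

/* Model of the register contents after a write: the upper four bits
 * come from the written data, the lower four bits are unaffected. */
static inline uint8_t icu_svct_after_write(uint8_t current, uint8_t written)
{
    return (uint8_t)((written & 0xF0u) | (current & 0x0Fu));
}
```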

The SVCT register is updated dynamically by the ICU. The 
four least significant bits always contain the vector value 
that would be returned to the CPU if an INTA or RETI cycle 
were executed. Therefore, when reading the SVCT register, 
the state of the CPU ST1 pin is used to select either pending 
interrupt data or in-service interrupt data. For example, if 
the SVCT register is read with ST1 = 0 (as for an INTA 
cycle), the VVVV field contains the encoded value of the 
highest priority pending interrupt. On the other hand, if the 
SVCT register is read with ST1 = 1, the VVVV field contains 
the encoded value of the highest priority in-service interrupt. 

Note: If the CPU ST1 output is connected directly to the ICU ST1 input, the 
vector read from SVCT is always the RETI vector. If both the INTA 
and RETI vectors are desired, additional logic must be added to drive 
the ICU ST1 input. A typical circuit is shown below. In this circuit, the 
state of the ICU ST1 input is controlled by both the CPU ST1 output 
and the selected address bit. 



TL/EE/5117-14 

3.3 ELTG — EDGE/LEVEL TRIGGERING 
REGISTERS (R2, R3) 

The ELTG registers determine the input trigger mode for 
each of the 16 interrupt inputs. Each input is assigned a bit 
in this register pair. An interrupt input is level-triggered if its 
bit in ELTG is set to 1. The input is edge-triggered if its bit is 
cleared. At reset, all bits in ELTG are set to 1. 

If odd-numbered interrupt positions must be used for soft- 
ware interrupts, the edge triggering mode must be selected 
and the corresponding interrupt inputs should be prevented 
from changing state. 

3.4 TPL — TRIGGERING POLARITY 
REGISTERS (R4, R5) 

The TPL registers determine the polarity of either the active 
level or the active edge for each of the 16 interrupt inputs. 
As with the ELTG registers, each input is assigned a bit. 
Possible triggering modes for the various combinations of 
ELTG and TPL bits are shown below. 

ELTG BIT   TPL BIT   TRIGGERING MODE 
   0          0      Falling Edge 
   0          1      Rising Edge 
   1          0      Low Level 
   1          1      High Level 

Software interrupt positions are not affected by their TPL 
bits. At reset, all TPL bits are set to 0. 
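
The table above maps directly onto a two-bit lookup. The following sketch decodes the trigger mode of one interrupt position from 16-bit ELTG and TPL values. It assumes that bit n of each register pair corresponds to interrupt position n, which the text does not state explicitly; the names are hypothetical.

```c
#include <stdint.h>

enum icu_trigger {
    ICU_FALLING_EDGE = 0,  /* ELTG = 0, TPL = 0 */
    ICU_RISING_EDGE  = 1,  /* ELTG = 0, TPL = 1 */
    ICU_LOW_LEVEL    = 2,  /* ELTG = 1, TPL = 0 */
    ICU_HIGH_LEVEL   = 3   /* ELTG = 1, TPL = 1 */
};

/* Returns the trigger mode of position `pos` (0..15). */
static inline enum icu_trigger icu_trigger_mode(uint16_t eltg, uint16_t tpl, unsigned pos)
{
    unsigned e = (eltg >> pos) & 1u;
    unsigned t = (tpl >> pos) & 1u;
    return (enum icu_trigger)((e << 1) | t);
}
```

With the reset values (ELTG all ones, TPL all zeros) every position decodes as low-level triggered.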

Note 1: If edge-triggered interrupts are to be handled, the TPL register 
should be programmed before the ELTG register. 
This prevents spurious interrupt requests from being generated 
from edge-triggered interrupt positions during the ICU initialization. 

Note 2: Hardware interrupt inputs connected to cascaded ICUs must have 
their TPL bits set to 0. 

3.5 IPND — INTERRUPT PENDING REGISTERS (R6, R7) 

The IPND registers track interrupt requests that are pending 
but not yet serviced. Each interrupt position is assigned a bit 
in IPND. When an interrupt is pending, the corresponding bit 
in IPND is set. The IPND data are used by the ICU to gener- 
ate interrupts to the CPU. These data are also used in poll- 
ing operations. 


         INTA CYCLE (ST1 = 0)                   RETI CYCLE (ST1 = 1) 
         Highest priority pending               Highest priority in-service 
         interrupt is from:                     interrupt was from: 
Field    cascaded ICU | any other source        cascaded ICU | any other source 

BBBB     1111         | programmed bias*        1111         | programmed bias* 

VVVV     encoded value of the highest           encoded value of the highest 
         priority pending interrupt             priority in-service interrupt 

*The programmed bias for the master ICU must range from 0000 to 0111 (binary) because the CPU interprets a one in the most significant bit position as a Cascade Table 
Index indicator for a cascaded ICU. 


FIGURE 3-2. HVCT Register Data Coding 

The IPND registers are also used for requesting software 
interrupts. This is done by writing specially formatted data 
bytes to either IPND(L) or IPND(H). The formats differ for 
registers R6 and R7. These formats are shown below: 

    IPND(L) (R6) : S0000PPP 
    IPND(H) (R7) : S0001PPP 

Where: S   = Set (S = 1) or Clear (S = 0) 
       PPP = a binary number identifying one of eight bits 

Note: The data read from either R6 or R7 are different from that written to 
the register because the ICU returns the register contents, rather than 
the formatted byte used to set the register bits. 
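
The two formats can be generated from a position number with a small helper. The sketch assumes that positions 0 to 7 are held in IPND(L) and positions 8 to 15 in IPND(H), with PPP equal to the position within its register; the structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct icu_ipnd_write {
    uint8_t reg;   /* register number to write: 6 = IPND(L), 7 = IPND(H) */
    uint8_t data;  /* formatted byte S000xPPP */
};

/* Builds the register/data pair that sets or clears one pending bit. */
static inline struct icu_ipnd_write icu_soft_irq(unsigned pos, bool set)
{
    struct icu_ipnd_write w;
    w.reg  = (uint8_t)((pos & 8u) ? 7u : 6u);
    w.data = (uint8_t)((set ? 0x80u : 0x00u)           /* S bit */
                       | ((pos & 8u) ? 0x08u : 0x00u)  /* bit 3 is 1 in the R7 format */
                       | (pos & 7u));                  /* PPP */
    return w;
}
```

Requesting a software interrupt at position 10, for instance, yields the byte 10001010 (binary) written to R7.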

The ICU automatically clears a set IPND bit when the pend- 
ing interrupt request is serviced. All pending interrupts in a 
register can be cleared by writing the pattern 'X1XXXXXX' 
to it (X = don’t care). To avoid conflicts with asynchronous 
hardware interrupt requests, the IPND registers should be 
frozen before pending interrupts are cleared. Refer to the 
Mode Control Register description for details on freezing 
the IPND registers. 

At reset, all IPND bits are set to 0. 

Note: The edge sensing mechanism used for hardware interrupts in the 
NS32202 ICU is a latching device that can be cleared only by ac- 
knowledging the interrupt or by changing the trigger mode to level 
sensing. Therefore, before clearing pending interrupts in the IPND 
registers, any edge-triggered interrupt inputs must first be switched to 
the level-triggered mode. This clears the edge-triggered interrupts; 
the remaining interrupts can then be cleared in the manner described 
above. This applies to clearing the interrupts only. Edge-triggered in- 
terrupts can be set without changing the trigger mode. 

3.6 ISRV — INTERRUPT IN-SERVICE 
REGISTERS (R8, R9) 

The ISRV registers track interrupt requests that are current- 
ly being serviced. Each interrupt position is assigned a bit in 
ISRV. When an interrupt request is serviced by the ICU, its 
corresponding bit is set in the ISRV registers. Before gener- 
ating an interrupt to the CPU, the ICU checks the ISRV reg- 
isters to ensure that no higher priority interrupt is currently 
being serviced. 

Each time the CPU executes an RETI instruction, the ICU 
clears the ISRV bit corresponding to the highest priority in- 
terrupt in service. The ISRV registers can also be written 
into by the CPU. This is done to implement the special mask 
priority mode. 

At reset, the ISRV registers are set to 0. 

Note: If the ICU initialization does not follow a hardware reset, the ISRV 
register should be cleared during initialization by writing zeroes into it. 


3.7 IMSK — INTERRUPT MASK REGISTERS (R10, R11) 

Each NS32202 interrupt position can be individually 
masked. A masked interrupt source is not acknowledged by 
the ICU. The IMSK registers store a mask bit for each of the 
ICU interrupt positions. If an interrupt position’s IMSK bit is 
set to 1, the position is masked. 

The IMSK registers are controlled by the system software. 
At reset, all IMSK bits are set to 1, disabling all interrupts. 

Note: If an interrupt must be masked off, the CPU can do so by setting the 
corresponding bit in the IMSK register. However, if an interrupt is set 
pending during the CPU instruction that masks off that interrupt, the 
CPU may still perform an interrupt acknowledge cycle following that 
instruction since it might have sampled the INT line before the ICU 
deasserted it. This could cause the ICU to provide an invalid vector. 
To avoid this problem, the above operation should be performed with 
the CPU interrupt disabled. 

3.8 CSRC — CASCADED SOURCE 
REGISTERS (R12, R13) 

The CSRC registers track any cascaded interrupt positions. 
Each interrupt position is assigned a bit in the CSRC regis- 
ters. If an interrupt position’s CSRC bit is set, that position is 
connected to the INT output of another NS32202 ICU, i.e., it 
is a cascaded interrupt. 

At reset, the CSRC registers are set to 0. 

Note 1: If any cascaded ICU is used, the CSRC register should be cleared 
during initialization (if the initialization does not follow a hardware 
reset) by writing zeroes into it. This should be done before setting 
the bits corresponding to the cascaded interrupt positions. This 
operation ensures that the 4-bit in-service counters (associated with 
each interrupt position to keep track of cascaded interrupts) always 
get cleared when the ICU is re-initialized. 

Note 2: Only the Master ICU should have any CSRC bits set. If CSRC bits 
are set in a cascaded ICU, incorrect operation results. 


3.9 FPRT — FIRST PRIORITY REGISTERS (R14, R15) 

The FPRT registers track the ICU interrupt position that cur- 
rently holds first priority. Only one bit of the FPRT registers 
is set at one time. The set bit indicates the interrupt position 
with first (highest) priority. 


The FPRT registers are automatically updated when the ICU 
is in the auto-rotate mode. The first priority interrupt can be 
determined by reading the FPRT registers. This operation 
returns a 16-bit word with only one bit set. An interrupt position 
can be assigned first priority by writing a formatted data 
byte to the FPRT(L) register. The format is shown below: 

   7   6   5   4   3   2   1   0 
   X   X   X   X   F   F   F   F 

Where: XXXX = Don’t Care 
       FFFF = A binary number from 0 to 15 indicating the 
              interrupt position assigned first priority. 

Note: The byte above is written only to the FPRT(L) register. Any data writ- 
ten to FPRT(H) is ignored. 

At reset the FFFF field is set to 0, thus giving interrupt posi- 
tion 0 first priority. 
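
Writes and reads of FPRT use different encodings: a 4-bit position number is written, while a one-hot 16-bit word is read back. The sketch below converts between the two. It assumes the FFFF field occupies the four least significant bits of the written byte and that bit n of the word read corresponds to position n; the function names are hypothetical.

```c
#include <stdint.h>

/* Byte to write to FPRT(L) to give `pos` (0..15) first priority. */
static inline uint8_t icu_fprt_write_byte(unsigned pos)
{
    return (uint8_t)(pos & 0x0Fu);
}

/* Position encoded by a one-hot FPRT word, or -1 if the word does not
 * have exactly one bit set (for example after FPRT has been cleared). */
static inline int icu_fprt_position(uint16_t fprt)
{
    if (fprt == 0 || (fprt & (fprt - 1)) != 0) {
        return -1;
    }
    for (int pos = 0; pos < 16; pos++) {
        if (fprt & (1u << pos)) {
            return pos;
        }
    }
    return -1;
}
```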


3.10 MCTL — MODE CONTROL REGISTER (R16) 

The contents of the MCTL register set the operating mode of the 
NS32202 ICU. The MCTL bit map is shown below. 

CFRZ    Determines whether or not the NS32202 counter 
        readings are frozen. When frozen, the 
        counters continue counting but the LCCV and 
        HCCV registers are not updated. Reading of 
        the true value of LCCV and HCCV is possible 
        only while they are frozen. 

        CFRZ = 0 => LCCV and HCCV Not Frozen 
        CFRZ = 1 => LCCV and HCCV Frozen 

COUTD   Determines whether the COUT/SCIN pin is an 
        input or an output. COUT/SCIN should be 
        used as an input only for testing purposes. In 
        this case an external sampling clock must be 
        provided; otherwise hardware interrupts will not 
        be recognized. 

        COUTD = 0 => COUT/SCIN is Output 
        COUTD = 1 => COUT/SCIN is Input 

COUTM   When the COUT/SCIN pin is programmed as 
        an output (COUTD = 0), this bit determines 
        whether the output signal is in pulsed form or in 
        square wave form. 

        COUTM = 0 => Square Wave Form 
        COUTM = 1 => Pulsed Form 

CLKM    Used only in the 8-bit Bus Mode. This bit 
        controls the clock wave form on any of the pins 
        G0/IR0, ..., G3/IR6 programmed as counter 
        output. 

        CLKM = 0 => Square Wave Form 
        CLKM = 1 => Pulsed Form 

FRZ     Freeze Bit. In order to allow a synchronous 
        reading of the interrupt pending registers 
        (IPND), their status may be frozen, causing the 
        ICU to ignore incoming requests. This is of 
        special importance if a polling method is used. 

        FRZ = 0 => IPND Not Frozen 
        FRZ = 1 => IPND Frozen 

NTAR    Determines whether the ICU is in the AUTO- 
        ROTATE or FIXED Priority Mode. In AUTO- 
        ROTATE mode, the interrupt source at the 
        highest priority position, after being serviced, is 
        automatically assigned lowest priority. In this 
        mode, the interrupt in service always has 
        highest priority and nesting of interrupts is therefore 
        inhibited. 

        NTAR = 0 => Auto-Rotate Mode 
        NTAR = 1 => Fixed Mode 

T16N8   Controls the data bus mode of operation. 

        T16N8 = 0 => 8-Bit Bus Mode 
        T16N8 = 1 => 16-Bit Bus Mode 

At reset, all MCTL bits except COUTD are reset to 0. 
COUTD is set to 1. 

3.11 OCASN — OUTPUT CLOCK 
ASSIGNMENT REGISTER (R17) 

Used only in the 8-bit Bus Mode. The four least significant 
bits of this register control the output clock assignments on 
pins G0/IR0, ..., G3/IR6. If any of these bits is set to 1, the 
clock generated by either the H-Counter or the H + L-Coun- 
ter will be output to the corresponding pin. The four most 
significant bits of OCASN are not used. At Reset the four 
least significant bits are set to 0. 


Note: The interrupt sensing mechanism on pins G0/IR0, ..., G3/IR6 is not 
disabled when any of these pins is programmed as clock output. 
Thus, to avoid spurious interrupts, the corresponding bits in register 
IPS should also be set to zero. 

3.12 CIPTR — COUNTER INTERRUPT 
POINTER REGISTER (R18) 

The CIPTR register tracks the assignment of counter out- 
puts to interrupt positions. A bit map of this register is shown 
below. 


   7   6   5   4   3   2   1   0 
   H   H   H   H   L   L   L   L 


Where: HHHH = A 4-bit binary number identifying the 
interrupt position assigned to the H- 
Counter (or the H + L-counter if the 
counters are concatenated). 

LLLL = A 4-bit binary number identifying the 
interrupt position assigned to the L- 
counter. 

Note: Assignment of a counter output to an interrupt position also requires 
control bits to be set in the CICTL register. If a counter output is 
assigned to an interrupt position, external hardware interrupts at that 
position are ignored. 

At reset, all bits in the CIPTR are set to 1. (This means both 
counters are assigned to interrupt position 15.) 
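
Packing the two 4-bit position numbers into a CIPTR byte follows directly from the bit map above. The helper below is a minimal sketch with a hypothetical name; it only computes the byte and does not write to the device.

```c
#include <stdint.h>

/* Builds a CIPTR value: HHHH in bits 7..4, LLLL in bits 3..0. */
static inline uint8_t icu_ciptr(unsigned h_pos, unsigned l_pos)
{
    return (uint8_t)(((h_pos & 0x0Fu) << 4) | (l_pos & 0x0Fu));
}
```

The reset state corresponds to icu_ciptr(15, 15), that is, all bits set.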

3.13 PDAT — PORT DATA REGISTER (R19) 

Used only in the 8-bit Bus Mode. This register is used to 
input or output data through any of the pins G0/IR0, ..., 
G7/IR14 programmed as I/O ports by the IPS register. 
Any pin programmed as an output delivers the data 
written into PDAT. The input pins ignore it. Reading PDAT 
provides the logical value of all I/O pins, INPUT and 
OUTPUT. 

3.14 IPS — INTERRUPT/PORT SELECT REGISTER (R20) 

Used only in the 8-bit Bus Mode. This register controls the 
function of the pins G0/IR0, ..., G7/IR14. Each of these 
pins is individually programmed as an I/O port if the 
corresponding bit of IPS is 0, or as an interrupt source if the 
corresponding bit is 1. The assignment of the H-Counter output 
to G0/IR0, ..., G3/IR6 by means of register OCASN overrides 
the assignment to these pins as I/O ports or interrupt 
inputs. 

At Reset, all the IPS bits are set to 1. 

Note: Whenever a bit in the IPS register is set to zero, to program the 
corresponding pin as an I/O port, any pending interrupt on the corre- 
sponding interrupt position will be cleared. 

3.15 PDIR — PORT DIRECTION REGISTER (R21) 

Used only in the 8-bit Bus Mode. This register determines 
the direction of any of the pins G0/IR0, ..., G7/IR14 
programmed as I/O ports by the IPS register. A logic 1 
indicates an input, while a logic 0 indicates an output. 

At Reset, all the PDIR bits are set to 1. 

3.16 CCTL — COUNTER CONTROL REGISTER (R22) 


The CCTL register controls the operating modes of the 
counters. A bit map of CCTL is shown below. 


   7      6       5       4       3       2       1       0 
 CCON   CFNPS   COUT1   COUT0   CRUNH   CRUNL   CDCRH   CDCRL 


CCON    Determines whether the counters are independent 
        or concatenated to form a single 32-bit 
        counter (H + L-Counter). If a 32-bit counter is 
        selected, the bits corresponding to the H-Counter 
        will control the H + L-Counter, while 
        the bits corresponding to the L-Counter are not 
        used. 

        CCON = 0 => Two 16-bit Counters 
        CCON = 1 => One 32-bit Counter 

CFNPS   Determines whether the external clock is 
        prescaled or not. 

        CFNPS = 0 => Clock Prescaled (divided by 4) 
        CFNPS = 1 => Clock Not Prescaled 

COUT1 & 
COUT0   These bits are effective only when the COUT/ 
        SCIN pin is programmed as an OUTPUT 
        (COUTD bit in register MCTL is 0). Their logic 
        levels are decoded to provide different outputs for 
        COUT/SCIN, as detailed in the table below: 

        COUT1   COUT0   COUT/SCIN Output Signal 
          0       0     Internal Sampling Oscillator 
          0       1     Zero Detect Of L-Counter 
          1       0     Zero Detect Of H-Counter 
          1       1     Zero Detect Of H + L-Counter* 

        *If the H- and L-Counters are not concatenated and 
        COUT1/COUT0 are both 1, the COUT/SCIN pin is active 
        when either counter reaches zero. 


CRUNH   Determines the state of either the H-Counter or 
        the H + L-Counter, depending upon the status 
        of CCON. 

        CRUNH = 0 => H-Counter or H + L-Counter Halted 
        CRUNH = 1 => H-Counter or H + L-Counter Running 

CRUNL   Effective only when CCON = 0. This bit 
        determines whether the L-Counter is running or 
        halted. 

        CRUNL = 0 => L-Counter Halted 
        CRUNL = 1 => L-Counter Running 

CDCRH   Effective only when CRUNH = 0 (Counter Halted). 
        This bit is the single cycle decrement 
        signal for either the H-Counter or the 
        H + L-Counter. 

        CDCRH = 0 => No Effect 
        CDCRH = 1 => Decrement H-Counter or H + L-Counter 

CDCRL   Effective only when CRUNL = 0 and CCON = 0. 
        This bit is the single cycle decrement signal 
        for the L-Counter. 

        CDCRL = 0 => No Effect 
        CDCRL = 1 => Decrement L-Counter 

Note: The bits CDCRL and CDCRH are set when a logic 1 is written into 
them, but they are automatically cleared after the end of the write 
operation. This is needed to accomplish the decrement operation. 
Therefore, these bits always contain 0 when read. 

Reset does not affect the CCTL bits. 
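
The CCTL bit map translates into the following mask definitions. The constant and function names are hypothetical; only the bit positions come from the bit map in this section. The helper composes one possible configuration, a running 32-bit concatenated counter, as an illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    ICU_CCTL_CDCRL = 1u << 0,  /* single-cycle decrement, L-Counter */
    ICU_CCTL_CDCRH = 1u << 1,  /* single-cycle decrement, H or H + L-Counter */
    ICU_CCTL_CRUNL = 1u << 2,  /* L-Counter running */
    ICU_CCTL_CRUNH = 1u << 3,  /* H or H + L-Counter running */
    ICU_CCTL_COUT0 = 1u << 4,
    ICU_CCTL_COUT1 = 1u << 5,
    ICU_CCTL_CFNPS = 1u << 6,  /* 1 = clock not prescaled */
    ICU_CCTL_CCON  = 1u << 7   /* 1 = one 32-bit counter */
};

/* CCTL value that starts the concatenated H + L-Counter.
 * prescale = true selects the divide-by-4 prescaler (CFNPS = 0). */
static inline uint8_t icu_cctl_start_32bit(bool prescale)
{
    uint8_t value = ICU_CCTL_CCON | ICU_CCTL_CRUNH;
    if (!prescale) {
        value |= ICU_CCTL_CFNPS;
    }
    return value;
}
```

Because reset does not affect CCTL, initialization code would normally write a complete value such as this instead of modifying individual bits.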

3.17 CICTL — COUNTER INTERRUPT 
CONTROL REGISTER (R23) 

The CICTL register controls the counter interrupts and rec- 
ords counter interrupt status. Interrupts can be generated 
from either of the 16-bit counters. When the counters are 
concatenated, the interrupt control is through the H-Counter 


control bits. In this case the CIEL bit should be set to zero to 
avoid spurious interrupts from the L-Counter. A bit map of 
the CICTL register is shown following. 

7 6 5 4 3 2 1 0 

CERH CIRH CIEH WENH CERL CIRL CIEL WENL 

CERH H-Counter Error Flag. This bit is set (1) when a 
second interrupt request from the H-Counter 
(or H + L-Counter) occurs before the first re- 
quest is acknowledged. 

CIRH H-Counter Interrupt Request. It is set (1) when 
an interrupt is pending from the H-Counter (or 
H + L-Counter). It is automatically reset when 
the interrupt is acknowledged. 

CIEH H-Counter Interrupt Enable. When it is set, the 
H-Counter (or H + L-Counter) interrupt is en- 
abled. 


WENH H-Counter Control Write Enable. When WENH 
is set (1), bits CERH, CIRH, and CIEH can be 
written. 


CERL L-Counter Error Flag. This bit is set (1) when a 
second interrupt request from the L-Counter 
occurs before the first request is acknowl- 
edged. 

CIRL L-Counter Interrupt Request. It is set (1) when 
an interrupt is pending from the L-Counter. It is 
automatically reset when the interrupt is ac- 
knowledged. 

CIEL L-Counter Interrupt Enable. When it is set (1), 
the L-Counter interrupt is enabled. 

WENL L-Counter Control Write Enable. When WENL 
is set (1), bits CERL, CIRL, and CIEL can be 
written. 


Note: Setting the write enable bits (WENH or WENL) and writing any of the 
other CICTL bits are concurrent operations. That is, the ICU will ig- 
nore any attempt to alter CICTL bits if the proper write enable bit is 
not set in the data byte. 
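
The write-enable rule in the note above can be illustrated with a simplified software model. It is a sketch under assumptions, not a description of the silicon: it treats the write enable bits as not being stored, and it ignores the hardware updates of the CIRx and CERx flags. The names are hypothetical; the bit positions follow the CICTL bit map.

```c
#include <stdint.h>

enum {
    ICU_CICTL_WENL = 0x01, ICU_CICTL_CIEL = 0x02,
    ICU_CICTL_CIRL = 0x04, ICU_CICTL_CERL = 0x08,
    ICU_CICTL_WENH = 0x10, ICU_CICTL_CIEH = 0x20,
    ICU_CICTL_CIRH = 0x40, ICU_CICTL_CERH = 0x80
};

/* Model of a CICTL write: each group of three control bits changes
 * only if its write enable bit is set in the same data byte. */
static inline uint8_t icu_cictl_after_write(uint8_t current, uint8_t data)
{
    uint8_t result = current;

    if (data & ICU_CICTL_WENH) {
        result = (uint8_t)((result & 0x1Fu) | (data & 0xE0u)); /* CERH, CIRH, CIEH */
    }
    if (data & ICU_CICTL_WENL) {
        result = (uint8_t)((result & 0xF1u) | (data & 0x0Eu)); /* CERL, CIRL, CIEL */
    }
    return result;
}
```

Enabling the H-Counter interrupt therefore takes the byte WENH | CIEH (30 hex); writing CIEH alone leaves the register unchanged.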


At reset, all CICTL bits are set to 0. However, if the counters 
are running, the bits CIRL, CERL, CIRH and CERH may be 
set again after the reset signal is removed. 


3.18 LCSV/HCSV — L-COUNTER STARTING VALUE/ 
H-COUNTER STARTING VALUE REGISTERS 
(R24, R25, R26, AND R27) 

The LCSV and HCSV registers store the start values for the 
L-Counter and H-Counter, respectively. Each time a counter 
reaches zero, the start value is automatically reloaded from 
either LCSV or HCSV, one clock cycle after zero count is 
reached. Loading LCSV or HCSV from the CPU must be 
synchronized to avoid writing the registers while the reload- 
ing of the counters is occurring. One method is to halt the 
counters while the registers are loaded. 

When the 16-bit counters are concatenated, the LCSV and 
HCSV registers hold the 32-bit start count, with the least 
significant byte in R24 and the most significant byte in R27. 
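
For the concatenated case, the byte order above means a 32-bit start count is split least significant byte first. The helper below is a minimal sketch with a hypothetical name; out[0] is the byte destined for R24 and out[3] the byte destined for R27.

```c
#include <stdint.h>

/* Splits a 32-bit start count into the four bytes for R24..R27. */
static inline void icu_split_start_count(uint32_t count, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(count >> (8 * i));
    }
}
```

As the text notes, the counters should be halted, or the write otherwise synchronized, while these bytes are loaded.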


3.19 LCCV/HCCV— L-COUNTER CURRENT VALUE/ 
H-COUNTER CURRENT VALUE REGISTERS 
(R28, R29, R30, AND R31) 

The LCCV and HCCV registers hold the current value of the 
counters. If the CFRZ bit in the MCTL register is reset (0), 
these registers are updated on each clock cycle with the 
current value of the counters. LCCV and HCCV can be read 
only when the counter readings are frozen (CFRZ bit in the 
MCTL register is 1). They can be written only when the 
counters are halted (CRUNL and/or CRUNH bits in the 
CCTL register are 0). This last feature allows new initial 
count values to be loaded immediately into the counters, 
and can be used during initialization to avoid long initial 
counts. 

When the 16-bit counters are concatenated, the LCCV and 
HCCV registers hold the 32-bit current value, with the least 
significant byte in R28 and the most significant byte in R31. 

3.20 REGISTER INITIALIZATION 

Figure 3-3 shows a recommended initialization procedure 
for the ICU that sets up all the ICU registers for proper 
operation. 

FIGURE 3-3. Recommended ICU Initialization Sequence (TL/EE/5117-15) 

4.0 Device Specifications 

4.1 NS32202 PIN DESCRIPTIONS 

4.1.1 Power Supply 

Power (VCC): +5V DC Supply 
Ground (GND): Power Supply Return 

4.1.2 Input Signals 

Reset (RST): Active low. This signal initializes the ICU. (The 
ICU initializes to the 8-bit bus mode.) 

Chip Select (CS): Active low. This signal enables the ICU to 
respond to address, data, and control signals from the CPU. 

Addresses (A0 through A4): Address lines used to select 
the ICU internal registers for read/write operations. 

High Byte Enable (HBE): Active low. Enables data transfers 
on the most-significant byte of the Data Bus. If the ICU 
is in the 8-bit Bus Mode, this signal is not used and should 
be connected to either GND or VCC. 

Read (RD): Active low. Enables data to be read from the 
ICU’s internal registers. 

Write (WR): Active low. Enables data to be written into the 
ICU’s internal registers. 


Status (ST1): Status signal from the CPU. When the Hardware 
Vector Register is read, this signal differentiates an 
INTA cycle from an RETI cycle. If ST1 = 0 the ICU initiates 
an INTA cycle. If ST1 = 1 an RETI cycle will result. 

Interrupt Requests (IR1, IR3, ..., IR15): These eight inputs 
are used for hardware interrupts. Each may be individually 
triggered in one of four modes: Rising Edge, Falling 
Edge, Low Level, or High Level. 

Counter Clock (CLK): External clock signal to drive the ICU 
internal counters. 

4.1.3 Output Signals 

Interrupt Output (INT): Active low. This signal indicates 
that an interrupt is pending. 

4.1.4 Input/Output Signals 

Data Bus 0-7 (D0 through D7): Eight low-order data bus 
lines used in both 8-bit and 16-bit bus modes. 

General Purpose I/O Lines (G0/IR0, G1/IR2, ..., G7/IR14): 
These pins are the high-order data bits when the ICU 
is in the 16-bit bus mode. When the ICU is in the 8-bit bus 
mode, each of these can be individually assigned one of the 
following functions: 

• Additional Hardware Interrupt Input (IR0 through 
IR14) 

• General Purpose Data Input 

• General Purpose Data Output 

• Clock Output from H-Counter (Pins G0/IR0 through 
G3/IR6 only) 

It should be noted that, for maximum flexibility in assigning 
interrupt priorities, the interrupt positions corresponding to 
pins G0/IR0, ..., G7/IR14 and IR1, ..., IR15 are interleaved. 

Counter or Oscillator Output/Sampling Clock Input 
(COUT/SCIN): As an output, this pin provides either a clock 
signal generated by the ICU internal oscillator, or a zero 
detect signal from one or both of the ICU counters. As an 
input, it is used for an external clock, to override the internal 
oscillator used for interrupt sampling. This is done only for 
testing purposes. 

4.2 ABSOLUTE MAXIMUM RATINGS 

Temperature Under Bias                   0°C to +70°C
Storage Temperature                      -65°C to +150°C
All Input or Output Voltages with
  Respect to GND                         -0.5V to +7.0V
Power Dissipation                        1.5 Watt


Note: Absolute maximum ratings indicate limits beyond 
which permanent damage may occur. Continuous operation 
at these limits is not intended; operation should be limited to 
those conditions specified under Electrical Characteristics. 


4.3 ELECTRICAL CHARACTERISTICS 

TA = 0° to 70°C, VCC = +5V ± 5%, GND = 0V

Symbol  Parameter                         Conditions          Min   Typ   Max   Units

VIL     Input Low Voltage                                                 0.8   V
VIH     Input High Voltage                                    2.0               V
VOL     Output Low Voltage                IOL = 2 mA                      0.45  V
VOH     Output High Voltage               IOH = -400 µA       2.4               V
IL      Leakage Current (Output and I/O   0.4 ≤ VIN ≤ VCC     -20         20    µA
        Pins in TRI-STATE/Input mode)
II      Input Load Current                VIN = 0 to VCC      -20         20    µA
ICC     Power Supply Current              IOUT = 0, T = 0°C               300   mA


Connection Diagram

NS32202 ICU (Top View)

Pin  Signal       Pin  Signal

 1   IR15         40   VCC
 2   INT          39   IR13
 3   ST1          38   IR11
 4   G7/IR14      37   IR9
 5   G6/IR12      36   IR7
 6   G5/IR10      35   IR5
 7   G4/IR8       34   IR3
 8   G3/IR6       33   IR1
 9   G2/IR4       32   CLK
10   G1/IR2       31   WR
11   G0/IR0       30   RD
12   D7           29   COUT/SCIN
13   D6           28   HBE
14   D5           27   RST
15   D4           26   A4
16   D3           25   A3
17   D2           24   A2
18   D1           23   A1
19   D0           22   A0
20   GND          21   CS

TL/EE/5117-3

Order Number NS32202D-6, NS32202D-10
See NS Package Number D40C

FIGURE 4-1

4.4 SWITCHING CHARACTERISTICS 

4.4.1 Definitions

All the timing specifications given in this section refer to
0.8V or 2.0V on the input and output signals as illustrated in
Figure 4-2, unless specifically stated otherwise.

Abbreviations:

L.E.: leading edge         R.E.: rising edge
T.E.: trailing edge        F.E.: falling edge

TL/EE/5117-16

FIGURE 4-2. Timing Specification Standard

4.4.1.1 Timing Tables


                                                                NS32202-10
Symbol     Figure  Description           Reference/Conditions      Min   Max   Units

READ CYCLE
tAhRDia    4-3     Address Hold Time     After RD T.E.             10          ns
tAsRDa     4-3     Address Setup Time    Before RD L.E.            35          ns
tCShRDia   4-3     CS Hold Time          After RD T.E.             15          ns
tCSsRDa    4-3     CS Setup Time         Before RD L.E.            30          ns
tDhRDia    4-3     Data Hold Time        After RD T.E.             5     50    ns
tRDaDv     4-3     Data Valid            After RD L.E.                   150   ns
tRDw       4-3     RD Pulse Width        At 0.8V (Both Edges)      160         ns
tSsRDa     4-3     ST1 Setup Time        Before RD L.E.            35          ns
tShRDia    4-3     ST1 Hold Time         After RD T.E.             -30         ns

WRITE CYCLE
tAhWRia    4-4     Address Hold Time     After WR T.E.             10          ns
tAsWRa     4-4     Address Setup Time    Before WR L.E.            35          ns
tCShWRia   4-4     CS Hold Time          After WR T.E.             15          ns
tCSsWRa    4-4     CS Setup Time         Before WR L.E.            30          ns
tDhWRia    4-4     Data Hold Time        After WR T.E.             10          ns
tDsWRia    4-4     Data Setup Time       Before WR T.E.            70          ns
tWRiaPf    4-4     Port Output Floating  After WR T.E. (To PDIR)         200   ns
tWRiaPv    4-4     Port Output Valid     After WR T.E.                   200   ns
tWRw       4-4     WR Pulse Width        At 0.8V (Both Edges)      160         ns


4.4.1.1 Timing Tables (Continued)

                                                                               NS32202-10
Symbol    Figure  Description                         Reference/Conditions      Min   Max   Units

OTHER TIMINGS
tCOUTl    4-8     Internal Sampling Clock Low Time    At 0.8V (Both Edges)      50          ns
tCOUTp    4-8     Internal Sampling Clock Period                                400         ns
tSCINh    4-7     External Sampling Clock High Time   At 2.0V (Both Edges)      100         ns
tSCINl    4-7     External Sampling Clock Low Time    At 0.8V (Both Edges)      100         ns
tSCINp    4-7     External Sampling Clock Period                                800         ns
tCh       4-9     External Clock High Time            At 2.0V (Both Edges)      100         ns
                  (Without Prescaler)
tChp              External Clock High Time            At 2.0V (Both Edges)      40          ns
                  (With Prescaler)
tCl       4-9     External Clock Low Time             At 0.8V (Both Edges)      100         ns
                  (Without Prescaler)
tClp              External Clock Low Time             At 0.8V (Both Edges)      40          ns
                  (With Prescaler)
tCy               External Clock Period                                         400         ns
                  (Without Prescaler)
tCyp              External Clock Period                                         100         ns
                  (With Prescaler)
tGCOUTl   4-9     Counter Output Transition Delay     After CLK F.E.                  300   ns
tCOUTw            Counter Output Pulse Width          At 0.8V (Both Edges)      50          ns
                  in Pulsed Form
tACKIR            Interrupt Request Delay             After Previous Interrupt  500         ns
                                                      Acknowledge
tIRld             INT Output Delay                    After Interrupt                 800   ns
                                                      Request Active
tIRw              Interrupt Request Pulse Width       At 0.8V (Both Edges)      50          ns
                  in Edge Trigger
tRSTw             RST Pulse Width                     At 0.8V (Both Edges)      400         ns
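
The minimum clock periods in the table translate directly into maximum clock frequencies. The sketch below does that arithmetic for the external counter clock (with and without prescaler) and the external sampling clock. The helper name and dictionary keys are illustrative only.

```python
def max_frequency_mhz(min_period_ns: float) -> float:
    """Highest frequency in MHz allowed by a minimum period in ns."""
    return 1000.0 / min_period_ns


# Minimum periods taken from the timing table (NS32202-10).
EXTERNAL_CLOCK_PERIOD_NS = 400        # without prescaler
EXTERNAL_CLOCK_PERIOD_PRESC_NS = 100  # with prescaler
EXTERNAL_SAMPLING_PERIOD_NS = 800

limits = {
    "CLK": max_frequency_mhz(EXTERNAL_CLOCK_PERIOD_NS),
    "CLK with prescaler": max_frequency_mhz(EXTERNAL_CLOCK_PERIOD_PRESC_NS),
    "SCIN": max_frequency_mhz(EXTERNAL_SAMPLING_PERIOD_NS),
}
```

Under these figures the counter clock is limited to 2.5 MHz without the prescaler and 10 MHz with it, and the external sampling clock to 1.25 MHz.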



TL/EE/5117-21

FIGURE 4-7. External Interrupt-Sampling-Clock to be Provided at Pin COUT/SCIN When In Test Mode

FIGURE 4-8. Internal Interrupt-Sampling-Clock Provided at Pin COUT/SCIN

TL/EE/5117-23

FIGURE 4-9. Relationship Between Clock Input at Pin CLK and Counter Output Signals at Pins COUT/SCIN or
G0/IR0, ..., G3/IR6, in Both Pulsed Form and Square Waveform

National Semiconductor


PRELIMINARY 


NS32203-10 Direct Memory Access Controller 


General Description 

The NS32203 Direct Memory Access Controller (DMAC) is a
support chip for the Series 32000® microprocessor family
designed to relieve the CPU of data transfers between memory
and I/O devices. The device is capable of packing data
received from 8-bit peripherals into 16-bit words to reduce
system bus loading. It can operate in local and remote
configurations. In the local configuration it is connected to
the multiplexed Series 32000 bus and shares with the CPU the
bus control signals from the NS32201 Timing Control Unit
(TCU). In the remote configuration, the DMAC, in conjunction
with its own TCU, communicates with I/O devices and/or memory
through a dedicated bus, enabling rapid transfers between
memory and I/O devices. The DMAC provides four 16-bit I/O
channels which may be configured as two complementary pairs
to support chaining.


Features 

■ Direct or Indirect data transfers 

■ Memory to Memory, I/O to I/O or Memory to I/O 
transfers 

■ Remote or Local configurations 

■ 8-Bit or 16-Bit transfers 

■ Transfer rates up to 5 Megabytes per second 

■ Command Chaining on complementary channels 

■ Wide range of channel commands 

■ Search capability 

■ Interrupt Vector generation 

■ Simple interface with the Series 32000 Family of 
Microprocessors 

■ High Speed XMOS™ Technology

■ Single +5V Supply

■ 48-Pin Dual-In-Line Package 


Block Diagram 



TL/EE/8701-1 

1.0 Product Introduction

The NS32203 Direct Memory Access Controller (DMAC) is 
specifically designed to minimize the time required for high 
speed data transfers in a Series 32000-based computer 
system. It includes a wide variety of options and operating 
modes to enhance data throughput and system optimiza- 
tion, and to allow dynamic reconfiguration under program 
control. 

The NS32203 can operate in two basic system configurations:
local and remote. In the local configuration, the DMAC and
the CPU share the same bus (address, data and control) and
only one of them can perform data transfers on the bus at any
one time. In this configuration, the DMAC and the CPU also
share a Timing Control Unit (TCU) and a single set of address
latches. Since this configuration yields a minimum part-count
system, it offers a good cost/performance trade-off in many
situations.

The remote configuration is intended to minimize the CPU
bus use. In this configuration, the NS32203, I/O devices and
optional buffer memory have their own dedicated bus (remote
bus) so that an I/O transfer may be performed without loading
the CPU bus (local bus).

Communication between the dedicated bus and the CPU 
bus may be initiated at any time by either the CPU or the 
NS32203. The DMAC accesses the CPU bus whenever a 
data transfer to/from memory or any I/O device residing on 
this bus is to be performed. The CPU, in turn, accesses the 
dedicated bus for reading status data or for programming 
either the DMAC or its I/O devices. 

The NS32203 internal organization consists of seven func- 
tional blocks as illustrated in the block diagram. Descrip- 
tions of these blocks are given below. 

DMA Channels. The NS32203 provides four channels. Each
channel accepts a request from a peripheral I/O device and
informs it when data transfer cycles are about to begin. A
set of registers is provided for each channel to control the
type of operation for that channel.

Bus Interface Unit. The bus interface unit controls all data 
transfers between peripheral I/O devices and memory 
whenever the DMAC is in control of the bus. This unit also 
controls the transfer of data between the CPU and the 
DMAC internal registers. 

Timing and Control Logic. This block generates all the 
sequencing and control signals necessary for the operation 
of the DMAC. 

Priority Resolver. This block resolves contentions among 
channels requesting service simultaneously. 

2.0 Functional Description 

2.1 RESETTING 

The RST/HLT line serves both as a reset input for the on-chip
logic and as a DMAC HALT input. Resetting is accomplished by
pulling RST/HLT low for at least 64 clock cycles. Upon
detecting a Reset, the DMAC terminates any data transfer in
progress, resets its internal logic and enters an inactive
state. On application of power, RST/HLT must be held low for
at least 50 µs after VCC is stable. This is to ensure that
all on-chip voltages are stable before operation. Whenever
reset is applied, the rising edge must occur while the clock
signal on the CLK pin is high (see Figures 2-1 and 2-2). The
NS32201 TCU provides circuitry to meet the reset
requirements. Figure 2-3 shows the recommended connections.
The HALT function is accomplished when RST/HLT is activated
for 1 or 2 clock cycles and then released. It can be used to
stop any data transfer in progress in case of a bus error. As
soon as HALT is acknowledged by the NS32203, the current
transfer operation is terminated. See Figure 4-18.
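
The two uses of the RST/HLT line differ only in how long the line is held low. The sketch below classifies a low pulse by its length in clock cycles and works out how many cycles the power-on low time requires at a given clock rate. The function names are hypothetical, and the "unspecified" result for pulses of 3 to 63 cycles is an assumption: the text does not say how the DMAC treats them.

```python
import math

RESET_MIN_CYCLES = 64     # minimum RST/HLT low time for a reset
HALT_CYCLES = (1, 2)      # RST/HLT low time interpreted as HALT
POWER_ON_MIN_US = 50.0    # minimum low time after VCC is stable


def classify_rst_hlt(low_cycles: int) -> str:
    """Classify an RST/HLT low pulse by its length in clock cycles."""
    if low_cycles in HALT_CYCLES:
        return "halt"
    if low_cycles >= RESET_MIN_CYCLES:
        return "reset"
    return "unspecified"  # assumption: not covered by the text


def power_on_low_cycles(clock_mhz: float) -> int:
    """Cycles RST/HLT must stay low at power-on to satisfy both the
    50 us requirement and the 64-cycle requirement."""
    return max(RESET_MIN_CYCLES, math.ceil(POWER_ON_MIN_US * clock_mhz))
```

At an illustrative 10 MHz clock the 50 µs requirement dominates (500 cycles); at 1 MHz the 64-cycle minimum does.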

FIGURE 2-1. Power-On Reset Requirements

FIGURE 2-2. General Reset Timing

2.2 DATA TRANSFER OPERATIONS

After the NS32203 has been initialized by software, it is 
ready to transfer blocks of data, containing up to 64 kbytes, 
between memory and I/O devices, without further interven- 
tion required of the CPU. Upon receiving a transfer request 
from an I/O device, the DMAC performs the following oper- 
ations: 

1) Acquires control of the bus.

2) Acknowledges the requesting I/O device which is connected
to the highest priority channel.

3) Starts executing data transfer cycles according to the
values stored into the control registers of the channel being
serviced.

4) Terminates data transfers and relinquishes control of the
bus as soon as one of the programmed conditions is met.


Each channel can be programmed for indirect or direct data 
transfers. Detailed descriptions of these transfer types are 
provided in the following sub-sections. 

2.2.1 Indirect Data Transfers 

In this mode of operation, each byte or word transfer be- 
tween source and destination requires at least two bus cy- 
cles. The data is first read into the DMAC and subsequently 
it is written into the destination. The bus cycles in this case 
are similar to the CPU bus cycles when the MMU is not 
used. This mode is slower than the direct mode, but is the 
only one that allows some data manipulation like Byte 
Search or Word Assembly/Disassembly. Figure 2-4 and 2-5 
show the read and write cycle timing diagrams related to 
indirect data transfers. If a search operation is specified, 
extra clock cycles may be added following each read cycle. 



FIGURE 2-4. Indirect Read Cycle

FIGURE 2-5. Indirect Write Cycle (Single Transfer Mode)

Note: If burst mode is selected, HOLD is released at the end of the transfer operation.

2.2.2 Direct (Flyby) Data Transfers 

This mode of operation allows a very high data transfer rate
between source and destination. Each data byte or word to be
transferred requires only a single bus cycle instead of two
separate read and write cycles, which are typical of the
indirect mode. The DMAC accomplishes direct data transfers by
activating IORD, during memory write cycles, and IOWR, during
memory read cycles.

An I/O device, in the direct mode, is usually enabled by the
proper acknowledge signal (ACKn) from the DMAC. No search or
word assembly/disassembly is possible during direct data
transfers. Figures 2-6 and 2-7 show the timing diagrams of
direct memory-to-I/O and I/O-to-memory transfers
respectively.

Note 1: In the direct mode each channel can control only one I/O device
because the I/O device is hardwired to the ACKn output of the
corresponding channel. In the indirect mode, a channel can control
multiple devices as long as each device is selected through its own
address rather than the ACKn output. However, the possibility of
selecting a single I/O device by the ACKn output is maintained in
the indirect mode as well.

Note 2: Whenever the DMAC is either idle or is performing indirect transfers,
it generates the IORD and IOWR signals as a replica of RD and WR.
This simplifies the logic required to access I/O devices wired for
direct data transfers.



FIGURE 2-6. Direct Memory-To-I/O Data Transfer (Single Transfer Mode) 

2.3 LOCAL CONFIGURATION 

As previously mentioned, in the local configuration the DMAC
shares with the CPU and MMU the multiplexed address/data bus
as well as the control signals from the NS32201 TCU. A typical
local configuration is shown in Figure 2-8. The DMAC, in the
local configuration, must gain control of the bus whenever a
data transfer cycle is to be performed, even though it is
directed to an I/O device and is related to an indirect data
transfer. This causes the system to be quite sensitive to the
volume of data handled by the DMAC. Thus, the overall system
performance decreases as the volume of data increases. A
possible solution to this problem is to use the remote
configuration, described in the following section. A
significant advantage of the local configuration is its
simplicity.

FIGURE 2-7. Direct I/O-To-Memory Data Transfer (Single Transfer Mode)

FIGURE 2-8. NS32203 Interconnections In Local Configuration

Note 1: The 16 Bit I/O device is wired for direct transfers. 

Note 2: The data buffers should not be enabled during direct data transfers or CPU accesses to the DMAC registers. 



2.4 REMOTE CONFIGURATION

The remote configuration is intended to minimize CPU bus
usage. In this configuration, the DMAC, buffer memory and I/O
devices reside on a dedicated bus. Communication between the
dedicated bus and the CPU bus is achieved by means of
TRI-STATE buffers. Whenever the CPU needs to access the
dedicated bus, it issues a bus request to the NS32203 by
activating the BREQ signal. As the dedicated bus becomes idle,
the DMAC pulls off the bus and acknowledges the CPU request by
activating BGRT. This output is also used as a control signal
for the interconnection logic of the two buses.

The CPU can either be interrupted by BGRT or it can poll BGRT
to determine when the dedicated bus can be accessed. The DMAC,
in turn, before accessing the CPU bus, has to gain control of
it. This is accomplished through the usual request-acknowledge
mechanism performed by means of the HOLD and HLDA signals.

Figure A-1 in Appendix A shows an interconnection diagram of a
basic remote configuration. Both TCUs are clocked by the same
clock signal. They are synchronized during reset by the
RWEN/SYNC signal so that their output clocks are in phase.
Figures 2-9 and 2-10 show the timing diagrams for read and
write accesses to the NS32203 internal registers.



FIGURE 2-9. Write to NS32203 Internal Registers 

2.5 DATA SOURCE (DESTINATION) ATTRIBUTES 

Two types of data source (destination) are recognized: I/O
device and memory. If the source (destination) is an I/O
device, its address register is not changed after a data
transfer; if it is memory, its address register is either
incremented or decremented after any data transfer, according
to the value of the corresponding direction bit. In the remote
configuration, any data source (destination) may reside either
on the CPU bus or on the dedicated bus. If it resides on the
dedicated bus, the NS32203 does not activate the HOLD request
line when an access to the source (destination) is performed,
unless a direct transfer with a data destination (source)
residing on the CPU bus is required.

Data can be transferred in either 8-bit or 16-bit units. The
DMAC always considers the memory to be 16 bits wide. Thus, if
an 8-bit transfer is specified, address bit A0 will determine
the byte of the data bus where the transfer takes place. If
A0 = 0, the transfer occurs on the low order byte. If A0 = 1,
it occurs on the high order byte. Different transfer widths
can be specified for source and destination. However, some
limitations exist in specifying these transfer widths when
certain operations must be performed. These limitations are
explained below.

1) If a transfer block has an odd number of bytes or is not
word aligned, an 8-bit width for source and destination
should be selected.

2) 16-bit I/O transfers cannot be specified with 8-bit
memory transfers.

3) Memory to memory transfers should have the same
width.

Note 1: If source and destination are both memory, DMAC transfers can
only be performed in indirect mode.

Note 2: If source and destination are both I/O devices and direct mode is
being used, the source device is accessed by IORD and ACKn; the
destination device is accessed by WR (from the NS32201) and CS
(from the address decoder). This allows a one direction data
transfer only from one I/O device (source) to another. If data is to be
transferred in both directions in direct mode between two I/O
devices, two channels must be used (one for each direction of transfer),
and extra hardware is required to control the read and write signals
to the two I/O devices.

Note 3: When an 8-bit transfer is related to an I/O device, the other half of
the 16-bit data bus is considered as DON'T CARE, and the HBE
signal may be activated.
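
The byte-lane rule and the three width limitations above can be sketched as a small checker. This is an illustration under stated assumptions: the function names, the "io"/"memory" labels and the single `start_address` argument (standing in for "word aligned") are hypothetical and do not correspond to DMAC registers.

```python
def byte_lane(address: int) -> str:
    """Byte of the 16-bit data bus used by an 8-bit transfer (selected by A0)."""
    return "high" if address & 1 else "low"


def check_widths(src_kind, src_bits, dst_kind, dst_bits, byte_count, start_address=0):
    """Return a list of violated width limitations for a proposed transfer.

    src_kind/dst_kind are "io" or "memory"; src_bits/dst_bits are 8 or 16.
    """
    problems = []
    # Limitation 1: odd-length or unaligned blocks need 8-bit on both sides.
    if (byte_count % 2 or start_address % 2) and (src_bits, dst_bits) != (8, 8):
        problems.append("odd or unaligned block requires 8-bit source and destination")
    # Limitation 2: 16-bit I/O cannot be combined with 8-bit memory.
    ends = {(src_kind, src_bits), (dst_kind, dst_bits)}
    if ("io", 16) in ends and ("memory", 8) in ends:
        problems.append("16-bit I/O cannot be paired with 8-bit memory")
    # Limitation 3: memory-to-memory transfers use one width.
    if src_kind == dst_kind == "memory" and src_bits != dst_bits:
        problems.append("memory-to-memory transfers need equal widths")
    return problems
```

An empty list means none of the three limitations is violated; the 8-bit I/O to 16-bit memory case passes and is the one handled by word assembly.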

2.6 WORD ASSEMBLY/DISASSEMBLY 

This feature is automatically enabled when indirect transfers
are selected, with data transferred between an 8-bit wide I/O
device and a 16-bit I/O device or memory. For every 16-bit I/O
device or memory access, the DMAC accesses the 8-bit I/O
device twice, assembling two data bytes into a 16-bit word or
breaking a 16-bit word into two data bytes, depending on the
direction of transfer. The word assembly/disassembly feature
allows a significant increase in the transfer speed and
minimizes the CPU bus usage when the transfer occurs between
an 8-bit I/O device residing on the dedicated bus, and a
16-bit I/O device or memory residing on the CPU bus. Word
assembly/disassembly is not possible during direct data
transfers.

Note: Requests from other channels are not acknowledged in the middle of 
a word assembly/disassembly. If this is unacceptable, 8 bit transfers 
should be specified for both source and destination. 
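
The saving from word assembly can be estimated by counting accesses on each side of the transfer. The sketch below counts accesses, not clock cycles, for an indirect transfer between an 8-bit I/O device and 16-bit memory. The `assemble` helper assumes the first byte read lands in the low-order half, which matches the A0 rule of Section 2.5 for an incrementing, word-aligned address but is otherwise an assumption.

```python
def indirect_accesses(byte_count: int, word_assembly: bool) -> dict:
    """Count accesses to the 8-bit I/O device and to the 16-bit side."""
    if not word_assembly:
        # 8-bit on both sides: one read and one write per byte.
        return {"io_accesses": byte_count, "memory_accesses": byte_count}
    if byte_count % 2:
        raise ValueError("word assembly needs an even byte count")
    words = byte_count // 2
    # Two 8-bit device accesses for every 16-bit access.
    return {"io_accesses": 2 * words, "memory_accesses": words}


def assemble(first: int, second: int) -> int:
    """Pack two bytes into a word (assumed order: first byte is low-order)."""
    return ((second & 0xFF) << 8) | (first & 0xFF)
```

For a 1024-byte block the 16-bit side sees 512 accesses instead of 1024, which is where the reduced CPU bus usage in the remote configuration comes from.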


2.7 AUTO TRANSFER 

The NS32203 initiates a data transfer as a result of a request
from an I/O device. In some cases a data transfer may be
necessary without the corresponding request signal being
asserted. This can happen, for example, when a block of data
is to be moved from one memory region to another. In such
cases, the auto transfer mode can be selected by setting an
appropriate bit in the command register. The DMAC will
initiate a data transfer regardless of the REQn signal for
that channel.

Note: For proper operation, when auto transfer is required, the low order 
byte of the command register (containing the auto-transfer enable bit) 
should be written into after the other registers controlling the channel 
operation have been initialized. 

2.8 SEARCH 

The NS32203 provides a search capability that can be used 
to detect the occurrence of a certain data pattern. The 
search is performed by comparing each data byte with the 
search register, in conjunction with the mask register. An 
appropriate bit in the command register indicates whether 
the search continues ‘UNTIL’ a match occurs, or ‘WHILE’ a 
match exists. The search operation does not necessarily 
involve a data transfer. The DMAC allows a block of data to 
be searched without requiring any data transfer between 
source and destination. When performing a search, the user 
can specify whether or not the matched byte will be trans- 
ferred. If ‘INCLUSIVE SEARCH’ is specified (INC = 1), the 
matched byte will be transferred, and the channel parame- 
ters will be updated accordingly. In this case, if a 16 bit word 
has been read from the data source and the search condi- 
tion is satisfied by the low order byte, then the high order 
byte is transferred as well. If ‘EXCLUSIVE SEARCH’ is 
specified (INC = 0), the transfer will terminate with the last 
byte before the search condition was satisfied, and the pa- 
rameters will point to the last transferred byte. 

Search is not possible during direct transfers. 
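
The UNTIL/WHILE and inclusive/exclusive options can be illustrated with a byte-wide software model. It is a sketch only: it assumes that a 1 bit in the mask means "compare this bit" (the polarity of the real mask register is not stated in this section), it does not model the 16-bit case, and the function name is hypothetical.

```python
def search_transfer(data: bytes, pattern: int, mask: int,
                    until: bool = True, inclusive: bool = False) -> bytes:
    """Return the bytes transferred before the search terminates.

    until=True  -> continue UNTIL a match occurs
    until=False -> continue WHILE a match exists
    Assumption: mask bit = 1 means the bit takes part in the compare.
    """
    out = bytearray()
    for byte in data:
        match = (byte & mask) == (pattern & mask)
        stop = match if until else not match
        if stop:
            if inclusive:
                out.append(byte)   # terminating byte is transferred too
            break
        out.append(byte)
    return bytes(out)
```

For example, searching UNTIL a line feed (0x0A) with a full mask transfers everything before the line feed, plus the line feed itself when the inclusive option is chosen.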

2.9 INTERRUPTS 

The NS32203 provides interrupt circuitry that can be used to 
generate an interrupt whenever a data transfer is completed 
or a search condition is met. If an NS32202 ICU is used, the 
INT signal from the DMAC should be connected to an inter- 
rupt input of the ICU. When an interrupt occurs and the 
corresponding interrupt acknowledge (INTA) or return from 
interrupt (RETI) cycle is executed by the CPU, the NS32203 
supplies its own vector as if it were a cascaded ICU. For 
such operation the virtual address of the interrupt vector 
register should be placed in the ICU cascade table, de- 
scribed in the NS32016 and NS32202 data sheets. See 
section 3.1.2. 

2.10 TRANSFER MODES 

When the NS32203 is in the inactive state and a channel 
requests service, the DMAC gains control of the bus and 
enters the active state. It is in this state that the data trans- 
fer takes place in one of the following modes: 

SINGLE TRANSFER MODE 

In single transfer mode, the NS32203 makes a single byte or
word transfer for each HOLD/HLDA handshake sequence.

In this case the request signal from the I/O device is edge
sensitive, that is, a single transfer is performed each time a
falling edge on REQn occurs. To perform multiple transfers, it
is therefore necessary to temporarily deassert REQn after each
transfer is initiated. If auto transfer mode is selected, the
bus is released between two transfers for at least one clock
cycle.

BURST (DEMAND) TRANSFER MODE 

In burst transfer mode the DMAC will continue making data
transfers until REQn goes inactive. Thus, the I/O device
requesting service may suspend data transfer by bringing REQn
inactive. Service may be resumed by asserting REQn again. If
the auto transfer mode is selected, the DMAC will perform a
single burst of data transfers until the end-transfer
condition is reached.

Note 1: In either of the transfer modes described above, data transfers can 
only occur as long as the byte count is not zero or a search condi- 
tion is not met. Whenever any of these conditions occur, the 
NS32203 terminates the current operation and releases the bus for 
at least one clock cycle. 

Note 2: Whenever the DMAC releases HOLD, it waits for HLDA to go inac- 
tive for at least one clock cycle before reasserting HOLD again to 
continue the transfer operation. 

2.11 CHAINING 

The NS32203 provides a chaining feature that allows the 
four DMAC channels to be regarded as two complementary 
pairs. Channels 0 and 1 form the first pair, while channels 2 
and 3 form the second pair. Each pair is programmed inde- 
pendently by setting the corresponding bit in the configura- 
tion register. When two channels are complementary, only 
the even channel can perform transfer operations, while the 
odd one serves as temporary storage for the new control 
values and parameters loaded for the chaining operation. If 
an operation is being performed by the even channel of a 
pair and an end-condition is reached, the channel is not 
returned to the inactive state; rather, a new set of control 
values with or without parameters is loaded from the com- 
plementary channel and a new operation is started. During 
the reload operation the bus is released for at least two 
clock cycles. At the end of the second operation the chan- 
nel returns to the inactive state, unless a new set of values 
has been loaded into the complementary channel by the 
CPU. 

The chaining feature can be used to transfer blocks of data 
to/from non-contiguous memory segments. For example, 
the CPU can load channel 0 and 1 with control values and 
parameters for the first two blocks. After the operation for 
the first block is completed by channel 0, the control values 
and parameters stored in channel 1 are transferred to chan- 
nel 0, during an update cycle, and a second operation is 
started. The CPU, being notified by an interrupt, can load 
channel 1 registers with control values and parameters for 
the third data block. 

Note 1: Whenever a reload operation occurs, the register values of the com- 
plementary channel are affected. Thus, the CPU must always load a 
new set of values into the complementary channel if another chain- 
ing operation is required. 

Note 2: When the chain option is selected, the CPU must be given the op- 
portunity to acquire the bus for enough time between DMAC opera- 
tions, in order for the complementary channel to be updated. 
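
The three-block example above can be modelled in a few lines. The sketch treats a complementary pair as an active even channel plus an odd channel that only holds the next block. It assumes the reload leaves the odd channel empty, which is a simplification of "the register values of the complementary channel are affected". Class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    address: int
    byte_count: int


class ChannelPair:
    """Even channel transfers; odd channel stores the next block."""

    def __init__(self, even: Block, odd: Optional[Block] = None):
        self.even: Optional[Block] = even
        self.odd: Optional[Block] = odd
        self.completed: List[Block] = []

    def end_condition(self) -> bool:
        """Even channel reached an end-condition; reload from the odd one.

        Returns True if a chained operation starts, False if the channel
        returns to the inactive state.
        """
        self.completed.append(self.even)
        self.even, self.odd = self.odd, None   # assumed: odd values are consumed
        return self.even is not None


pair = ChannelPair(Block(0x1000, 256), Block(0x8000, 128))
chained = pair.end_condition()         # block 1 done, block 2 starts
pair.odd = Block(0x4000, 64)           # CPU reloads the odd channel after the interrupt
pair.end_condition()                   # block 2 done, block 3 starts
still_active = pair.end_condition()    # block 3 done, nothing left to chain
```

The CPU's only job between blocks is to refill the odd channel before the even channel finishes, which is why Note 2 requires that the CPU be given bus time between DMAC operations.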

2.12 CHANNEL PRIORITIES 

The NS32203 has four I/O channels, each of which can be 
connected to an I/O device. Since no dependency exists 
between the different I/O devices, a priority level is as- 
signed to each I/O channel, and a priority resolver is provid- 
ed to resolve multiple requests activated simultaneously. 


The priority resolver checks the priorities on every cycle. If a 
channel is being serviced and a higher priority request is 
received, the channel operation is suspended and control 
passes to the higher priority channel, unless the lock bit for 
the lower priority channel is set. If the lock bit is set, that 
channel operation is continued until completion before con- 
trol passes to the higher priority channel. The bus is always 
released for at least two clock cycles when control passes 
from one channel to another. 

Two types of priority encodings are available as software 
selectable options. 

The first is fixed priority which fixes the channels in priority 
order based on the decreasing values of their numbers. 
Channel 3 has the lowest priority, while channel 0 has the 
highest. 

The second option is variable priority. The last channel that 
receives service becomes the lowest priority channel 
among all other channels with variable priority, while the 
channels which previously had lower priority will get their 
priorities increased. If variable priority is selected for all four 
channels, any I/O device requesting service is guaranteed 
to be acknowledged after no more than three higher priority 
services have occurred. This prevents any channel from 
monopolizing the system. Priority types can be intermixed 
for different channels. 

As an example, let channels 0, 2 and 3 have variable priority 
and channel 1 fixed priority. Channel 2 receives service first, 
followed by channel 0. The priority levels among all chan- 
nels will change as follows. 

Priority     Initial Order      Next Order     Final Order

High  3             ch.0     ACK -> ch.0       ch.3
      2             ch.1            ch.1       ch.1  -> fixed priority
      1      ACK -> ch.2            ch.3       ch.2
Low   0             ch.3            ch.2       ch.0

Whenever the PT bit (priority type) in the command register
is changed, the priority levels of all the channels are reset
to the initial order. If only one channel has variable
priority, then no change in priority will occur from the
initial order.

Note: If the lock bit is not set, three idle states are inserted between the
write cycle of a previous burst indirect transfer and the next read
cycle.
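
The variable-priority rule can be checked against the example above with a short model. It is a sketch of the rule as described, not of the DMAC's internal logic: fixed-priority channels keep their slot, the serviced channel drops to the lowest slot held by a variable-priority channel, and the variable channels that were below it move up. The function name is hypothetical.

```python
def service(order, channel, variable):
    """Return the new priority order (highest first) after `channel` is serviced."""
    if channel not in variable:
        return list(order)                 # fixed-priority channels never move
    slots = [i for i, ch in enumerate(order) if ch in variable]
    chans = [order[i] for i in slots]
    chans.remove(channel)
    chans.append(channel)                  # serviced channel becomes lowest
    new_order = list(order)
    for slot, ch in zip(slots, chans):
        new_order[slot] = ch
    return new_order


variable = {0, 2, 3}                       # channel 1 has fixed priority
initial = [0, 1, 2, 3]                     # highest priority first
after_ch2 = service(initial, 2, variable)
after_ch0 = service(after_ch2, 0, variable)
```

Running the two services in sequence gives the "Next Order" and "Final Order" columns of the table, with channel 1 staying at level 2 throughout.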

3.0 Architectural Description 

The NS32203 has 128 8-bit registers that can be addressed
either individually or in pairs, using the 7 least significant
bits of the address bus and the high byte enable signal HBE.
Seventy-one of these registers are reserved, while the rest
are accessible by the CPU for read/write operations. Figure
3-1 shows the NS32203 internal registers together with their
address offsets. Detailed descriptions of these registers are
given in the following sections.

3.1 GLOBAL REGISTERS 

The global registers consist of one configuration, one status 
and two interrupt vector registers. They are shared by all 
channels, and they control the overall operation of the 
NS32203. 

3.1.1 CONF — Configuration Register 

This register controls the hardware configuration of the 
NS32203 as well as the chaining feature. 

The CONF register format is shown below: 

7 6 5 4 3 2 1 0 



CNF:   Configuration Bit. Determines whether the
       NS32203 is in local or remote configuration.

       CNF = 0 => Local Configuration
       CNF = 1 => Remote Configuration

C0:    Chaining bit for channels 0 and 1. Determines
       whether or not channels 0 and 1 are complementary.

       C0 = 0 => Channels not complementary
       C0 = 1 => Channel 1 complementary to channel 0

C1:    Chaining bit for channels 2 and 3. Determines
       whether or not channels 2 and 3 are complementary.

       C1 = 0 => Channels not complementary
       C1 = 1 => Channel 3 complementary to channel 2

XXXXX: Reserved. These bits should be set to 0.

At reset, all CONF bits are reset to zero. 

Note: The CNF bit should never be set by the software if the DMAC is wired 
for local configuration, otherwise bus conflicts will result. 
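
As a minimal sketch of how software might compose a CONF value, the Python fragment below packs the three documented flags into one byte. The bit positions (CNF in bit 0, C0 in bit 1, C1 in bit 2) are an assumption inferred from the order in which the fields are described; the register format figure should be checked before relying on them. The helper name `make_conf` is hypothetical.

```python
# Assumed bit positions (not confirmed by the text above).
CNF_BIT = 0  # 0 = local configuration, 1 = remote configuration
C0_BIT = 1   # 1 = channel 1 complementary to channel 0
C1_BIT = 2   # 1 = channel 3 complementary to channel 2


def make_conf(remote=False, chain_01=False, chain_23=False):
    """Pack the CONF flags into one byte; reserved bits are left at 0."""
    return (
        (int(remote) << CNF_BIT)
        | (int(chain_01) << C0_BIT)
        | (int(chain_23) << C1_BIT)
    )
```

Per the note above, `remote=True` would only be appropriate when the DMAC is actually wired for remote configuration.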

3.1.2 HVCT — Hardware Vector Register 

This register contains the interrupt vector byte that is sup- 
plied to the CPU during an interrupt acknowledge (INTA) or 
return from interrupt (RETI) cycle. The HVCT register format 
is shown below. 


  7   6   5   4   3 | 2 | 1   0 
        BIAS        | E |  CN 


CN — Channel number. Represents the number of the in- 
terrupting channel 

E — Error code. Determines whether a normal operation 
completion or an error condition has occurred on 
the interrupting channel. 

E = 0 => Normal Operation Completion 
E = 1 => A second interrupt was generated by 
the same channel before the first interrupt 
was serviced. 

BIAS — Programmable bias. This field is programmed by 
writing the pattern BBBBB000 into the HVCT regis- 
ter. 
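
The vector byte layout can be illustrated with a short Python sketch. The BIAS position (upper five bits) follows from the BBBBB000 write pattern; placing E in bit 2 and CN in bits 1-0 is read from the format figure and should be treated as an assumption. The function names are hypothetical.

```python
def make_hvct_bias(bias):
    """Return the byte to write to HVCT for a 5-bit bias (pattern BBBBB000)."""
    if not 0 <= bias <= 0x1F:
        raise ValueError("BIAS is a five-bit field")
    return bias << 3


def decode_vector(vector):
    """Split a vector byte into its BIAS, E and CN fields."""
    return {
        "bias": (vector >> 3) & 0x1F,   # programmable bias, bits 7-3
        "error": bool(vector & 0x04),   # E, assumed bit 2
        "channel": vector & 0x03,       # CN, assumed bits 1-0
    }
```

Such a decoder would normally be applied to a value read from SVCT rather than HVCT, since reading HVCT changes the interrupt logic.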


The NS32203 always interprets a read of the HVCT register 
as either an interrupt acknowledge (INTA) cycle or a return 
from interrupt (RETI) cycle. Since these cycles cause inter- 
nal changes to the DMAC, normal programs should never 
read the HVCT register (see next section). The DMAC dis- 
tinguishes an INTA cycle from a RETI cycle by the state of 
an internal flip-flop, called Interrupt Service Flip-Flop, that 
toggles every time the HVCT register is read. This flip-flop is 
cleared on reset or when the HVCT register is written into. 
When an interrupt is acknowledged by the CPU, the INT 
signal is deasserted unless another interrupt from a lower 
priority channel is pending. In this case the INT signal is 
deasserted when the acknowledge cycle for the second in- 
terrupt is performed. 

For this reason, if the INT signal is connected to an interrupt 
input of the NS32202 ICU, the triggering mode of that inter- 
rupt position should be ‘low level’. 

Furthermore, if that ICU interrupt input is programmed for 
cascaded operation and nesting of interrupts from other de- 
vices connected to the ICU is to be allowed, then the ICU 
interrupt input connected to the DMAC should be masked 
off during the interrupt service routine, before the CPU inter- 
rupt is reenabled. This is because the DMAC does not pro- 
vide interrupt nesting capability. 


An interrupt from a certain channel can be acknowledged 
only after the return from interrupt from a previously ac- 
knowledged interrupt is performed. 
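
The INTA/RETI distinction described above can be modelled in a few lines of Python. This is only a behavioural sketch: the class and method names are hypothetical, and it assumes that the first HVCT read after the flip-flop is cleared is treated as an INTA cycle.

```python
class InterruptServiceFlipFlop:
    """Toy model of the flip-flop that toggles on every HVCT read."""

    def __init__(self):
        self.in_service = False  # cleared on reset

    def reset(self):
        self.in_service = False

    def write_hvct(self):
        # Writing HVCT clears the flip-flop.
        self.in_service = False

    def read_hvct(self):
        # Assumption: a read with the flip-flop cleared is an INTA cycle,
        # the following read is the matching RETI cycle.
        cycle = "RETI" if self.in_service else "INTA"
        self.in_service = not self.in_service
        return cycle
```

The model shows why an ordinary program read of HVCT is harmful: a stray read would leave the flip-flop out of step with the real interrupt sequence until HVCT is written or the device is reset.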
channel #3 channel #2 channel #1 channel #0 
The status of each channel is defined in a four-bit field as 
described below: 

TC — Transfer Complete. 

Indicates the completion of a channel operation, re- 
gardless of the state of the length register or whether 
a match/no match condition occurred. 

MN — Match/No Match Bit. 

This bit is set when a match/no match condition oc- 
curs. 

CH — Channel Halted. 

Set when a channel operation is halted by pulling the 
RST/HLT pin. 

ME — Multiple events. This bit is set when more than one of 
the above conditions have occurred. 

Note: If an interrupt is enabled, the corresponding bit in the status register is 
not cleared upon read, unless the interrupt is acknowledged. 

3.2 CONTROL REGISTERS 

Each of the four channels has three control registers, con- 
sisting of a 24-bit command register, an 8-bit search register 
and an 8-bit mask register. 

3.2.1 COM — Command Register 

The command register controls the operation of the associ- 
ated channel. It is divided into three separately addressable 
parts: COM(L), COM(M) and COM(H). The format of each 
part and bit functions are shown below. 

COM(L) — Command Register (Low-Byte) 
CC — Command Code 

CC = 00 => Channel Disabled 
CC = 01 => Search 
CC = 10 => Data Transfer 
CC = 11 => Data Transfer and Search 

DI — Direct/Indirect Transfers 

DI = 0 => Indirect Transfers 
DI = 1 => Direct Transfers 

INC — Inclusive/Exclusive Search 

INC = 0 => Exclusive Search 
INC = 1 => Inclusive Search 

3.1.3 SVCT — Software Vector Register 

The SVCT register is an image of the HVCT register. It is a 
read-only register used for diagnostics. It allows the pro- 
grammer to read the interrupt vector without affecting the 
interrupt logic of the NS32203. The format of the SVCT reg- 
ister is the same as that of the HVCT register. 

3.1.4 STAT — Status Register 

The status register contains status information of the 
NS32203, and can be used when the interrupts are not en- 
abled. Each set bit is automatically cleared when a read 
operation is performed. The format of this register is shown 
in the following figure. 


UW — Search type 

UW = 0 => Search UNTIL 
UW = 1 => Search WHILE 

PT — Priority type 

PT = 0 => Fixed 
PT = 1 => Variable 

LK — Priority lock 

LK = 0 => Priority Unlocked 
LK = 1 => Priority Locked 

AT — Auto transfer 

AT = 0 => Auto Transfer Disabled 
AT = 1 => Auto Transfer Enabled 

At Reset, the CC bits in COM(L) are cleared, disabling the 
channel. 

Note: The CC bits can be cleared by software during an indirect data trans- 
fer to stop the transfer. This, however, should not be done during 
direct data transfers. See section 3.3.3. 

COM(M) - Command Register (Middle-Byte) 

  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0 
  DD |  DW |  DL |  DT |  SD |  SW |  SL |  ST 

ST — Source Type 

ST = 0 => I/O Device 
ST = 1 => Memory 

SL — Source Location 
(Effective only in the remote configuration) 

SL = 0 => Local 
SL = 1 => Remote 

SW — Source Width 

SW = 0 => 8 Bits 
SW = 1 => 16 Bits 

SD — Source Direction 

SD = 0 => Up 
SD = 1 => Down 

DT — Destination Type 

DT = 0 => I/O Device 
DT = 1 => Memory 

DL — Destination Location 
(Effective only in the remote configuration) 

DL = 0 => Local 
DL = 1 => Remote 

DW — Destination Width 

DW = 0 => 8 Bits 
DW = 1 => 16 Bits 

DD — Destination Direction 

DD = 0 => Up 
DD = 1 => Down 
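
As an illustration of the COM(M) layout, the following Python sketch builds the byte from named fields, using the bit order given in the register format (DD in bit 7 down to ST in bit 0). The helper name `make_com_m` is hypothetical and not part of any National Semiconductor software.

```python
# Bit positions taken from the COM(M) register format.
COM_M_BITS = {"ST": 0, "SL": 1, "SW": 2, "SD": 3,
              "DT": 4, "DL": 5, "DW": 6, "DD": 7}


def make_com_m(**fields):
    """Build a COM(M) byte from single-bit fields, e.g. make_com_m(ST=1, SW=1)."""
    value = 0
    for name, bit_value in fields.items():
        if bit_value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
        value |= bit_value << COM_M_BITS[name]  # KeyError for unknown names
    return value
```

For example, a 16-bit memory source counting up (ST=1, SW=1, SD=0) feeding an 8-bit I/O device (DT=0, DW=0) gives `make_com_m(ST=1, SW=1)`, which evaluates to 0x05.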

COM(H) - Command Register (High-Byte) 

  7   |  6   |  5   |  4   3  |  2   |  1   |  0 
  HLI |  MNI |  TCI |   AMN   |  ATC |  DM  |  X 


X — Reserved. (Should be set to 0) 

DM — Transfer Mode 

DM = 0 => Single Transfer 
DM = 1 => Burst Transfer 

ATC — Action after Transfer Complete 

ATC = 0 => Disable Channel 
ATC = 1 => Load Control Values and Parameters 
from Complementary Channel and Continue 

AMN — Action after Match/No Match 

AMN = 00 => Disable Channel 
AMN = 01 => Continue 
AMN = 10 => Load Control Values from Complementary 
Channel and Continue 
AMN = 11 => Load Control Values and Parameters 
from Complementary Channel and Continue 

TCI — Interrupt Mask on “Transfer Complete” 

TCI = 0 => No Interrupt 
TCI = 1 => Interrupt 

MNI — Interrupt Mask on “Match/No Match” 

MNI = 0 => No Interrupt 
MNI = 1 => Interrupt 

HLI — Interrupt Mask on “Channel Halted” 

HLI = 0 => No Interrupt 
HLI = 1 => Interrupt 
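
A COM(H) value can be assembled in the same way. The sketch below assumes the bit order read from the register format (HLI in bit 7, MNI in bit 6, TCI in bit 5, AMN in bits 4-3, ATC in bit 2, DM in bit 1, reserved bit 0); the function name is hypothetical.

```python
def make_com_h(dm=0, atc=0, amn=0, tci=0, mni=0, hli=0):
    """Pack the COM(H) fields into one byte; reserved bit 0 stays 0."""
    if not 0 <= amn <= 3:
        raise ValueError("AMN is a two-bit field")
    for name, bit in (("dm", dm), ("atc", atc), ("tci", tci),
                      ("mni", mni), ("hli", hli)):
        if bit not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
    return (dm << 1) | (atc << 2) | (amn << 3) | (tci << 5) | (mni << 6) | (hli << 7)
```

A burst transfer that interrupts on completion would then be `make_com_h(dm=1, tci=1)`, i.e. 0x22 under these assumptions.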

3.2.2 SRCH — Search Register 

This 8-bit register holds the value to be compared with the 
data transferred during the channel operation. 

3.2.3 MSK — Mask Register 

The 8-bit mask register determines which bits of the trans- 
ferred data are compared with corresponding search regis- 
ter bits. If a mask register bit is set to 0, the corresponding 
search register bit is ignored in the compare operation. At 
reset, all the MSK bits are set to 0. 
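
The masked comparison described above can be sketched as follows. This is a model of the compare rule only, written in Python for illustration; the function name is hypothetical and the code says nothing about how the hardware sequences the search.

```python
def is_match(data, srch, msk):
    """True when every bit selected by MSK is equal in DATA and SRCH.

    Bits whose mask bit is 0 are ignored in the comparison.
    """
    return ((data ^ srch) & msk & 0xFF) == 0
```

For instance, with MSK = 0xDF bit 5 is ignored, so the bytes 0x41 and 0x61 compare as equal. In this model a mask of all zeroes (the reset value) makes every byte match, because no bit takes part in the comparison.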

3.3 PARAMETER REGISTERS 

Each channel has three parameter registers, consisting of a 
24-bit source address register, a 24-bit destination address 
register and a 16-bit block length register. 

3.3.1 SRC — Source Address Register 

The source address register points to the physical address 
of the data source. When the data source is an I/O device, 
the register does not change during the transfer operation. 
When the data source is memory, the register is increment- 
ed or decremented by either one or two after each transfer. 

3.3.2 DST — Destination Address Register 

The destination address register points to the physical ad- 
dress of the data destination. When the data destination is 
an I/O device, the register does not change during the 
transfer operation. When the data destination is memory, 
the register is incremented or decremented by either one or 
two after each transfer. 

3.3.3 LNGT — Block Length Register 

The block length register holds the number of bytes in the 
block to be transferred. It is decremented by either one or 
two after each transfer. 

Note: A direct data transfer can be stopped by writing zeroes into the LNGT 
register. The number of bytes transferred can be determined in this 
case, from the value of either the SRC or the DST register. 
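
The update rules for the parameter registers can be summarised in a small Python model. It is a sketch under stated assumptions: the function names are hypothetical, and wrap-around at 24 bits is assumed rather than taken from the text.

```python
ADDR_MASK = 0xFFFFFF  # SRC and DST are 24-bit registers


def next_address(addr, is_memory, width_bits, down):
    """Value of SRC or DST after one transfer."""
    if not is_memory:
        return addr                # I/O device: register does not change
    delta = width_bits // 8        # 1 for 8-bit, 2 for 16-bit transfers
    step = -delta if down else delta
    return (addr + step) & ADDR_MASK   # 24-bit wrap-around is an assumption


def bytes_transferred(initial_addr, current_addr, down=False):
    """Bytes moved so far, derived from a memory-side address register."""
    diff = initial_addr - current_addr if down else current_addr - initial_addr
    return diff & ADDR_MASK
```

The second helper reflects the note above: after a direct transfer is stopped by zeroing LNGT, the progress can be recovered from SRC or DST, provided the register consulted belongs to a memory side (an I/O-side address never changes).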
4.0 Device Specifications 

4.1. NS32203 PIN DESCRIPTIONS 

The following is a brief description of all NS32203 pins. The 
descriptions reference portions of the Functional Descrip- 
tion, Section 2.0. 


Connection Diagram 

[Figure: pinout, Top View (TL/EE/8701-12). Pin labels legible 
along one side: A22-A16, AD15-AD0, GND.] 

FIGURE 4-1. NS32203 Dual-In-Line Package 

Order Number NS32203D or NS32203N 
See NS Package Number D48A or N48A 


Chip Select (CS): When low, the device is selected, en- 
abling CPU access to the DMAC internal registers. 

Ready (RDY): Active high. When inactive, the DMA Control- 
ler extends the current bus cycle for synchronization with 
slow memory or peripherals. Upon detecting RDY active, 
the DMAC terminates the bus cycle. 

Channel Request 0-3 (REQ0-REQ3): Active low. These 
lines are used by peripheral devices to request DMAC serv- 
ice. 

Bus Request (BREQ): Used only in the remote configura- 
tion. This signal, when asserted, forces the DMAC to stop 
transferring data and to release the bus. It must be activated 
by the CPU before any CPU access to the remote bus is 
performed. In the local configuration this signal should be 
connected to Vcc via a 4.7k resistor. Section 2.4. 

Hold Acknowledge (HLDA): Active low. When asserted, 
indicates that control of the system bus has been relin- 
quished by the current bus master and the DMAC can take 
control of the bus. 

Clock (CLK): Clock signal supplied by the CTTL output of 
the NS32201 TCU. 

4.1.3 OUTPUT SIGNALS 

Address Bits 16-23 (A16-A23): Most significant 8 bits of 
the address bus. 

Hold Request (HOLD): Active low. Used by the DMAC to 
request control of the system bus. 

Channel Acknowledge 0-3 (ACK0-ACK3): These lines 
indicate that a channel is active. When a channel’s request 
is honored, the corresponding acknowledge line is activated 
to notify the peripheral device that it has been selected for a 
transfer cycle. Section 2.2.2. 

Bus Grant (BGRT): Used only in the remote configuration. 
This signal is used by the DMAC to inform the CPU that the 
remote bus has been relinquished by the DMAC and can be 
accessed by the CPU. Section 2.4. 

I/O Read (IORD): Active low. Enables data to be read from 
a peripheral device. Section 2.2.2. 

I/O Write (IOWR): Active low. Enables data to be written to 
a peripheral device. Section 2.2.2. 

Interrupt (INT): Active low. Used to generate an interrupt 
request when a programmed condition has occurred. Sec- 
tion 2.9. 


4.1.1 SUPPLIES 

Power (VCC): +5V positive supply. 

Ground (GND): Ground reference for on-chip logic. 

4.1.2 INPUT SIGNALS 

Reset/Halt (RST/HLT): Active low. If held active for 1 or 2 
clock cycles and released, this signal halts the DMAC oper- 
ation on the active channel. If held longer, it resets the 
DMAC. Section 2.1. 


4.1.4 INPUT/OUTPUT SIGNALS 

Address/Data 0-15 (AD0-AD15): Multiplexed Address/ 
Data bus lines. Also used by the CPU to access the DMAC 
internal registers. 

High Byte Enable (HBE): Active low. Enables data trans- 
fers on the most significant byte of the data bus. 

Address Strobe (ADS): Active low. Controls address latch- 
es and indicates the start of a bus cycle. 

Data Direction in (DDIN): Active low. Status signal indicat- 
ing the direction of data flow in the current bus cycle. 

4.2 ABSOLUTE MAXIMUM RATINGS 

If Military/Aerospace specified devices are required, 
please contact the National Semiconductor Sales 
Office/Distributors for availability and specifications. 

Temperature Under Bias                 0°C to +70°C 
Storage Temperature                    -65°C to +150°C 
All Input or Output Voltages with 
Respect to GND                         -0.5V to +7V 
Power Dissipation                      1.1 Watt 

Note: Absolute maximum ratings indicate limits beyond 
which permanent damage may occur. Continuous operation 
at these limits is not intended; operation should be limited to 
those conditions specified under Electrical Characteristics. 

4.3 ELECTRICAL CHARACTERISTICS TA = 0°C to +70°C, VCC = 5V ±5%, GND = 0V 

Symbol | Parameter | Conditions | Min | Typ | Max | Units 
-------+-----------+------------+-----+-----+-----+------ 
VIH | High Level Input Voltage | | 2.0 | | VCC + 0.5 | V 
VIL | Low Level Input Voltage | | -0.5 | | 0.8 | V 
VOH | High Level Output Voltage | IOH = -400 µA | (illegible in source) | | | V 
VOL | Low Level Output Voltage | IOL = 2 mA | | | 0.45 | V 
II | Input Load Current | 0 < VIN ≤ VCC | -20 | | 20 | µA 
IL | Leakage Current (Output and I/O Pins in TRI-STATE/Input Mode) | 0.4 ≤ VIN ≤ VCC | -20 | | 20 | µA 
ICC | Active Supply Current | IOUT = 0, TA = 25°C | | 180 | 300 | mA 

4.4 SWITCHING CHARACTERISTICS 

4.4.1 Definitions 

All the timing specifications given in this section refer to 
0.8V and 2.0V on all the input and output signals as illustrat- 
ed in Figures 4-2 and 4-3, unless specifically stated other- 
wise. 

ABBREVIATIONS: 

L.E. — leading edge      R.E. — rising edge 
T.E. — trailing edge     F.E. — falling edge 



[Figure TL/EE/8701-13: CLK and signal waveforms with 0.8V and 2.0V 
reference levels.] 

FIGURE 4-2. Timing Specification Standard 
(Signal Valid after Clock Edge) 

[Figure TL/EE/8701-14: CLK and signal waveforms with 0.8V and 2.0V 
reference levels.] 

FIGURE 4-3. Timing Specification Standard 
(Signal Valid before Clock Edge) 

4.4.2 Timing Tables 

4.4.2.1 Output Signals: Internal Propagation Delays, NS32203-10 

Maximum Times Assume Capacitive Loading of 100 pF. 

Name | Figure | Description | Reference/Conditions | Min | Max | Units 
-----+--------+-------------+----------------------+-----+-----+------ 
tALv | 4-7 | Address Bits 0-15 Valid | After R.E., CLK T1 | | 50 | ns 
tALh | 4-9 | Address Bits 0-15 Hold Time | After R.E., CLK T2 | 5 | | ns 
tAHv | 4-7 | Address Bits 16-23 Valid | After R.E., CLK T1 | | 50 | ns 
tAHh | 4-7 | Address Bits 16-23 Hold | After R.E., CLK T1 or Ti | 5 | | ns 
tALADSs | 4-8 | Address Bits 0-15 Set Up | Before ADS T.E. | 25 | | ns 
tAHADSs | 4-8 | Address Bits 16-23 Set Up | Before ADS T.E. | 25 | | ns 
tALADSh | 4-9 | Address Bits 0-15 Hold Time | After ADS T.E. | 15 | | ns 
tALf | 4-8 | Address Bits 0-15 Floating | After R.E., CLK T2 | | 25 | ns 
tDv | 4-7 | Data Valid (Write Cycle) | After R.E., CLK T2 | | 50 | ns 
tDh | 4-7 | Data Hold (Write Cycle) | After R.E., CLK T1 or Ti | 0 | | ns 
tDOv | 4-5 | Data Valid (Reading DMAC Registers) | After R.E., CLK T3 | | 50 | ns 
tDOh | 4-5 | Data Hold (Reading DMAC Registers) | After R.E., CLK T4 | 10 | | ns 
tHBEv | 4-7 | HBE Signal Valid | After R.E., CLK T1 | | 50 | ns 
tHBEh | 4-7 | HBE Signal Hold | After R.E., CLK T1 or Ti | 0 | | ns 
tDDINv | 4-8 | DDIN Signal Valid | After R.E., CLK T1 | | 65 | ns 
tDDINh | 4-8 | DDIN Signal Hold | After R.E., CLK T1 or Ti | 0 | | ns 
tADSa | 4-7 | ADS Signal Active | After R.E., CLK T1 | | 35 | ns 
tADSia | 4-7 | ADS Signal Inactive | After R.E., CLK T1 | | 40 | ns 
tADSw | 4-7 | ADS Pulse Width | at 0.8V (Both Edges) | 30 | | ns 
tALz | 4-12, 4-13 | AD0-AD15 Floating | After R.E., CLK Ti | | 55 | ns 
tAHz | 4-12, 4-13 | A16-A23 Floating | After R.E., CLK Ti | | 55 | ns 
tADSz | 4-12, 4-13 | ADS Floating | After R.E., CLK Ti | | 55 | ns 
tHBEz | 4-12, 4-13 | HBE Floating | After R.E., CLK Ti | | 55 | ns 
tDDINz | 4-12, 4-13 | DDIN Floating | After R.E., CLK Ti | | 55 | ns 
tHLDa | 4-11 | HOLD Signal Active | After R.E., CLK Ti | | 50 | ns 
tHLDia | 4-12 | HOLD Signal Inactive | After R.E., CLK Ti or T4 | | 50 | ns 
tINTa | 4-19, 4-21 | INT Signal Active | After R.E., CLK Ti | | 40 | ns 
tACKa | 4-16, 4-17, 4-7 | ACKn Signal Active | After R.E., CLK T1 | | 50 | ns 
tACKia | 4-16, 4-17, 4-7 | ACKn Signal Inactive | After F.E., CLK T4 | | 35 | ns 
Name | Figure | Description | Reference/Conditions | Min | Max | Units 
-----+--------+-------------+----------------------+-----+-----+------ 
tBGRTa | 4-13 | BGRT Signal Active | After R.E., CLK | | 65 | ns 
tBGRTia | 4-14 | BGRT Signal Inactive | After R.E., CLK | | 65 | ns 
tIORDa | 4-8, 4-9 | IORD Active | After R.E., CLK T2 | | 40 | ns 
tIORDia | 4-8 | IORD Inactive (During Indirect Transfers) | After R.E., CLK T4 | | 40 | ns 
tIORDia | 4-9 | IORD Inactive (During Direct Transfers) | After F.E., CLK T4 | | 40 | ns 
tIOWRa | 4-7, 4-10 | IOWR Active | After R.E., CLK T2 | | 40 | ns 
tIOWRia | 4-7 | IOWR Inactive (During Indirect Transfers) | After R.E., CLK T4 | | 40 | ns 
tIOWRdia | 4-10 | IOWR Inactive (During Direct Transfers) | After F.E., CLK T3 | | 40 | ns 


4.4.2.2 Input Signal Requirements: NS32203-10 

Name | Figure | Description | Reference/Conditions | Min | Max | Units 
-----+--------+-------------+----------------------+-----+-----+------ 
tPWR | 4-22 | Power Stable to RST/HLT R.E. | After VCC Reaches 4.75V | 50 | | µs 
tRSTw | 4-23 | RST/HLT Pulse Width (Resetting the DMAC) | at 0.8V (Both Edges) | 64 | | tCp 
tRSTs | 4-24 | RST/HLT Set Up Time (Resetting the DMAC) | Before F.E., CLK | 15 | | ns 
tHLTs | 4-18 | RST/HLT Setup Time (Halting a DMAC Transfer) | Before R.E., CLK T3 | 25 | | ns 
tHLTh | 4-19 | RST/HLT Hold Time (Halting a DMAC Transfer) | After R.E., CLK T4 | 10 | | ns 
tDIs | 4-6 | Data in Setup Time | Before R.E., CLK T3 | 15 | | ns 
tDIh | 4-6 | Data in Hold | After R.E., CLK T4 | 3 | | ns 
tDIs | 4-6 | Data in Setup Time (Writing to DMAC Registers) | After R.E., CLK T3 | 15 | | ns 
tDIh | 4-6 | Data in Hold (Writing to DMAC Registers) | After R.E., CLK T4 | 3 | | ns 
tHLDAs | 4-11, 4-12 | HOLDA Setup Time | Before R.E., CLK | 25 | | ns 
tHLDAh | 4-11 | HLDA Hold Time | After R.E., CLK | 10 | | ns 
tRDYs | 4-15 | RDY Setup Time | Before R.E., CLK T2 or T3 | 20 | | ns 
tRDYh | 4-15 | RDY Hold Time | After R.E., CLK T3 | 5 | | ns 
tREQs | 4-16, 4-17 | REQn Setup Time | Before R.E., CLK | 50 | | ns 
tREQh | 4-16, 4-17 | REQn Hold Time | After R.E., CLK | 10 | | ns 
tBREQs | 4-13 | BREQ Setup Time | Before R.E., CLK | 25 | | ns 
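
Because the reset pulse width tRSTw is specified in clock periods (64 tCp) rather than nanoseconds, its absolute value depends on the clock frequency. The Python sketch below does the conversion; the 10 MHz figure used in the example is an assumption based on the "-10" part suffix, and the function names are hypothetical.

```python
def clock_period_ns(freq_mhz):
    """Clock period tCp in nanoseconds for a clock frequency in MHz."""
    return 1000.0 / freq_mhz


def min_reset_pulse_ns(freq_mhz, cycles=64):
    """Minimum RST/HLT low time for a reset, given tRSTw = 64 tCp."""
    return cycles * clock_period_ns(freq_mhz)
```

At an assumed 10 MHz clock, tCp is 100 ns, so `min_reset_pulse_ns(10)` gives 6400 ns (6.4 µs). A halt pulse, by contrast, is only 1 or 2 clock cycles long, which at the same clock would be 100 to 200 ns.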



TL/EE/8701-16 

FIGURE 4-5. Read from DMAC Registers 



FIGURE 4-6. Write to DMAC Registers 



FIGURE 4-11. HOLD/HOLDA Sequence Start 



FIGURE 4-12. HOLD/HOLDA Sequence End 

Note 1: DMAC in local configuration. 

Note 2: The HOLD/HOLDA sequence shown above is related to the single transfer mode. 

In burst transfer mode HOLD is deactivated two cycles later. 


TL/EE/8701-23 



TL/EE/8701-26 



TL/EE/8701-27 


FIGURE 4-16. REQn/ACKn Sequence (DMAC Initially Not Idle) 



FIGURE 4-17. REQn/ACKn Sequence (DMAC Initially Idle) 



TL/EE/8701-29 

FIGURE 4-18. Halted Cycle 

Note 1: Halt may occur in previous T-States. It must be applied for 1 or 2 clock cycles. 

Note 2: If BREQ is asserted in the middle of a DMAC transfer, the transfer will always be completed. 



TL/EE/8701-30 

FIGURE 4-19. Interrupt on Transfer Complete 



FIGURE 4-20. Interrupt on Match/No Match 

Note: If inclusive search is specified, a write cycle is performed before INT is activated. 


TL/EE/8701-31 



TL/EE/8701-32 




FIGURE 4-22. Power on Reset 


FIGURE 4-23. Non Power on Reset 





NS32203-10 



TL/EE/8701-35 


FIGURE A-1. NS32203 Interconnections in Remote Configuration. 

Note: This logic does not support direct (flyby) DMAC transfers. 


Appendix A. Interfacing Suggestions 






National 
Semiconductor 


PRELIMINARY 


NS32CG821 microCMOS Programmable 
1M Dynamic RAM Controller/Driver 


General Description 

The NS32CG821 dynamic RAM controller provides a low 
cost, single chip interface between dynamic RAM and the 
NS32CG16. The NS32CG821 generates all the required ac- 
cess control signal timing for DRAMs. An on-chip refresh 
request clock is used to automatically refresh the DRAM 
array. Refreshes and accesses are arbitrated on chip. If 
necessary, a WAIT output inserts wait states into memory 
access cycles, including burst mode accesses. RAS low 
time during refreshes and RAS precharge time after refresh- 
es and back to back accesses are guaranteed through the 
insertion of wait states. Separate on-chip precharge coun- 
ters for each RAS output can be used for memory interleav- 
ing to avoid delayed back to back accesses because of 
precharge. 


Features 

■ Allows zero wait state operation 

■ On chip high precision delay line to guarantee critical 
DRAM access timing parameters 

■ microCMOS process for low power 

■ High capacitance drivers for RAS, CAS, WE and DRAM 
address on chip 

■ On chip support for page and static column DRAMs 

■ Byte enable signals on chip allow byte writing with no 
external logic 

■ Selection of controller speeds: 20 MHz and 25 MHz 

■ On board access refresh arbitration logic 

■ Direct interface to the NS32CG16 microprocessor 

■ 4 RAS and 4 CAS drivers (the RAS and CAS configura- 
tion is programmable) 


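Control | # of Pins (PLCC) | # of Address Outputs | Largest DRAM Possible | Direct Drive Memory Capacity 
--------+------------------+----------------------+-----------------------+----------------------------- 
NS32CG821 | 68 | 10 | 1 Mbit | 8 Mbytes 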
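
The figures in the table can be cross-checked with a short calculation, sketched here in Python. It assumes that the 10 address outputs are multiplexed as row and column addresses, and that the 8 Mbyte figure corresponds to four banks (one per RAS output) on a 16-bit data bus; the bus width is not stated in this section and is an assumption, as are the function names.

```python
def dram_locations(address_outputs):
    """Locations addressable with multiplexed row and column addresses."""
    return 2 ** (2 * address_outputs)


def capacity_bytes(address_outputs, banks, bus_width_bits):
    """Total memory in bytes for the given bank count and bus width."""
    return dram_locations(address_outputs) * banks * bus_width_bits // 8
```

With 10 address outputs, `dram_locations(10)` is 1,048,576, which matches a 1 Mbit DRAM organised as 1M x 1. Four such banks, 16 bits wide, give `capacity_bytes(10, 4, 16)` = 8,388,608 bytes, consistent with the 8 Mbyte direct drive capacity listed.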


Block Diagram 

[Figure: NS32CG821 DRAM Controller. Inputs: bank address in, 
row address in, column address in, control inputs, system clock. 
Internal blocks: address latch (row, column and bank); 
programming registers; arbiter and wait logic for memory 
access and refresh; memory cycle generator, delay line and 
bank select logic; RAS generator; CAS generator.] 


National Semiconductor 

PRELIMINARY 

HPC16083/HPC26083/HPC36083/HPC46083/ 
HPC16003/HPC26003/HPC36003/HPC46003 
High-Performance Microcontrollers 

General Description 

The HPC16083 and HPC16003 are members of the HPC™ 
family of High Performance microcontrollers. Each member 
of the family has the same core CPU with a unique memory 
and I/O configuration to suit specific applications. The 
HPC16083 has 8k bytes of on-chip ROM. The HPC16003 
has no on-chip ROM and is intended for use with external 
direct memory. Each part is fabricated in National’s ad- 
vanced microCMOS technology. This process combined 
with an advanced architecture provides fast, flexible I/O 
control, efficient data manipulation, and high speed compu- 
tation. 

The HPC devices are complete microcomputers on a single 
chip. All system timing, internal logic, ROM, RAM, and I/O 
are provided on the chip to produce a cost effective solution 
for high performance applications. On-chip functions such 
as UART, up to eight 16-bit timers with 4 input capture 
registers, vectored interrupts, WATCHDOG™ logic and 
MICROWIRE/PLUS™ provide a high level of system integration. 

The ability to address up to 64k bytes of external memory 
enables the HPC to be used in powerful applications typical- 
ly performed by microprocessors and expensive peripheral 
chips. The term “HPC16083” is used throughout this data- 
sheet to refer to the HPC16083 and HPC16003 devices un- 
less otherwise specified. 

The microCMOS process results in very low current drain 
and enables the user to select the optimum speed/power 
product for his system. The IDLE and HALT modes provide 
further current savings. The HPC is available in 68-pin 
PLCC, LCC, LDCC, PGA and 84-Pin TapePak® packages. 


Block Diagram (HPC16083 with 8k ROM shown) 



Features 

■ HPC family — core features: 

— 16-bit architecture, both byte and word 

— 16-bit data bus, ALU, and registers 

— 64k bytes of external direct memory addressing 

— FAST— 200 ns for fastest instruction when using 
20.0 MHz clock, 134 ns at 30 MHz 

— High code efficiency — most instructions are single 
byte 

— 16 x 16 multiply and 32 x 16 divide 

— Eight vectored interrupt sources 

— Four 16-bit timer/counters with 4 synchronous outputs 
and WATCHDOG logic 

— MICROWIRE/PLUS serial I/O interface 

— CMOS — very low power with two power save modes: 
IDLE and HALT 

■ UART — full duplex, programmable baud rate 

■ Four additional 16-bit timer/counters with pulse width 
modulated outputs 

■ Four input capture registers 

■ 52 general purpose I/O lines (memory mapped) 

■ 8k bytes of ROM, 256 bytes of RAM on chip 

■ ROMless version available (HPC16003) 

■ Commercial (0°C to +70°C), industrial (-40°C to 
+85°C), automotive (-40°C to +105°C) and military 
(-55°C to +125°C) temperature ranges 
National 

Semiconductor 


DP8510 BITBLT Processing Unit 


General Description 

The DP8510 BITBLT Processing Unit (BPU) is a high-per- 
formance microCMOS device designed for use in raster 
graphics applications. It implements, in high-speed pipelined 
logic, the data operations which are fundamental to BITBLT 
(BIT boundary Block Transfer) graphics: shifting, masking 
and bitwise logic operations. Under control of external hard- 
ware such as a state machine or a general-purpose micro- 
processor, it provides all necessary data path operations, 
easing the implementation of a wide variety of BITBLT sys- 
tems. A number of input pins control the proper data flow in 
the BPU. A simple handshake scheme is used to interface 
the CPU, the BPU and the memory system. 

The BPU has two modes, BITBLT and line drawing. The 
mode is set by the B/L pin. The line-drawing mode can be 
treated as a special case BITBLT with height and width 
equal to one. 

In order to perform a BITBLT operation, the BPU’s control 
register must first be loaded with four parameters: the shift 
number, left and right masks and the function select code, a 
total of 16 bits. BITBLT can then proceed, as directed by an 
external processor or state machine. It is the responsibility 
of the controller to generate appropriate addresses for the 
BITBLT, to interface with the frame buffer’s memory control 
circuitry, and to control the BPU itself. 
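
The "16 classical BITBLT functions" are the sixteen Boolean functions of one source bit and one destination bit, applied across a word. The Python sketch below illustrates the idea for a single 16-bit word with a write mask. The encoding of the function code as a 4-bit truth table is an assumption chosen for illustration; it is not the DP8510's actual function select code, and the barrel shift and separate left/right masks are omitted.

```python
WORD = 0xFFFF  # 16-bit data path


def bitblt_word(src, dst, func, mask=WORD):
    """Combine one source word with one destination word.

    func is a 4-bit truth table: bit (2*s + d) is the result for
    source bit s and destination bit d (hypothetical encoding).
    Only bit positions set in mask are updated; others keep dst.
    """
    result = 0
    if func & 0b0001:
        result |= ~src & ~dst   # s = 0, d = 0
    if func & 0b0010:
        result |= ~src & dst    # s = 0, d = 1
    if func & 0b0100:
        result |= src & ~dst    # s = 1, d = 0
    if func & 0b1000:
        result |= src & dst     # s = 1, d = 1
    result &= WORD
    return (result & mask) | (dst & ~mask & WORD)
```

Under this encoding 0b1100 copies the source, 0b0110 is XOR, 0b1000 is AND and 0b1110 is OR. The mask argument stands in for the left and right masks that protect destination bits outside the block at word boundaries.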


Features 

■ Supports all 16 classical BITBLT functions 

■ Pipelined data input for high system throughput 

■ Flexible architecture allows BPU to be used with a 
state machine or processor 

■ Multiple BPUs can be used for multiple bitplane/color 
applications 

■ Line drawing support 

■ Compatible with static or dynamic RAMs, including 
Video DRAMs 

■ Compatible with page mode, nibble mode and static 
column RAMs 

■ 32-bit to 16-bit barrel shifter 

■ 16-bit data port 

■ 16-word FIFO 

■ 16-bit logic operations 

■ 20 MHz operation 

■ Single +5 volt supply 

■ All inputs and outputs TTL-compatible 

■ Packaged in a 44-pin PCC (commercial) or 44-pin PGA 
(MIL) 

■ Single-bit pixel I/O port 

■ A member of National’s Advanced Graphics Chip Set 

■ microCMOS technology 


Block Diagram 







National 

Semiconductor 


DP8511 BITBLT Processing Unit (BPU) 


General Description 

The DP8511 BITBLT Processing Unit (BPU), a member of 
National Semiconductor’s Advanced Graphics Chip Set 
(AGCS), is a high performance microCMOS device intended 
for use in raster graphics applications. Specifically designed 
to complement the DP8500 Raster Graphics Processor 
(RGP), the BPU performs data operations that are elemen- 
tary to BITBLT (BIT boundary Block Transfer) graphics: 
Shift, mask, and bitwise logical manipulation of memory. Un- 
der the control of the RGP, the BPU performs the necessary 
BITBLT data path operations at pipelined hardware speeds. 
A simple set of control lines interfaces the BPU to the RGP 
and to the system memory. 

The BPU has two modes of operation: BITBLT and Line 
Drawing. BITBLT performs shift and logical operations on 
blocks of 16-bit data words. Line drawing performs similar 
operations on single-bit pixel data by utilizing a single bit 
pixel port (PDn). This port allows data read and read-modify- 
write operations on single pixels across a number of bit- 
planes, giving access to pixel depth. The BPU provides both 
pixel level processing commonly used in image processing 
applications and extremely fast planar operations used 
most frequently in color graphics. 

The BPU’s operation is controlled by the values loaded to 
the Control Register (CR) and the Function Select Register 
(FSR). This dual register configuration of the DP8511 allows 
for high throughput in multi-plane systems that incorporate a 
BPU per plane. This performance advantage is achieved by 
allowing the flexibility of changing the FSR’s contents 
independent of the CR, so that multiple bitplanes can be 
updated simultaneously while each BPU performs different 
logical operations on its own destination data. 

Features 

■ Interfaces directly to the DP8500 Raster Graphics 
Processor or any general purpose controller 

■ 20 MHz operation 

■ Supports all 16 classical BITBLT functions 

■ Pipelined data input for high system throughput 

■ Provides performance independent of the number of 
bitplanes 

■ Line Drawing support 

■ Compatible with static, dynamic RAMs, and Video 
RAMs 

■ Compatible with page mode, nibble mode and static 
column RAMs 

■ 32-bit to 16-bit barrel shifter 

■ 16-bit data port, single bit pixel port 

■ 16-word FIFO 

■ 16-bit logic operations 

■ Single + 5V supply 

■ All inputs and outputs TTL compatible 

■ 2 micron microCMOS technology 

■ Packaged in a 44-pin PCC (commercial) or 44-pin PGA 
(MIL) 


Connection Diagrams 

44-Pin Plastic Chip Carrier (PCC) 



Order Number DP8511V 
See NS Package Number V44A 
■ NS32CG16 emulator for software and 
hardware development and debugging 

■ 512 kbytes of mappable memory for 
emulation 

■ 15 MHz, 0 wait state access to emulation 
memory 

■ Sixteen definable events: match on 
address and data, no match on address 
and data and match on status conditions 
(address fetch, data read/write, slave 
cycle and interrupt acknowledge) 

■ Thirty-six software breakpoints using 
NS32CG16’s BPT instruction 

■ Two hardware breakpoints based on 
events 

■ 2k deep, event triggered, real time trace 
display in mnemonic and machine 
formats 

■ Execution time measurement with 1 µs 
resolution 

■ On screen menu for command selection 

■ FPU (Floating Point Unit) and BPU (Bit 
Aligned Block Transfer Processing Unit) 
support 

■ Software support via GNX™ tools 

■ Includes PC interface board and cable 

1.0 Product Overview 

The NS32CG16 ISE is a full featured emulator for the 
development of NS32CG16 based systems. The emulator 
works with SYS32/20 and SYS32/30 hosts. Up 
to 512 kbytes of memory may be mapped onto the 
target, allowing users to download their software into 
mapped (or emulation) memory. The emulator supports 
single stepping, 36 software breakpoints and 2 
hardware breakpoints based on any of sixteen 
pre-defined events. Events may be defined as match on 
address & data, no match on address & data and 
match on status conditions (address fetch, data read/ 
write, slave cycle and interrupt acknowledge). A 2k 
deep real time trace may be triggered by any of the 
sixteen pre-defined events and displayed in mnemonic 
or machine formats. The emulator supports execution 
time measurement with a resolution of 1 µs. 
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1.0. Product Overview (Continued) 

The emulator connects to a high speed parallel inter- 
face board on the development system host. The em- 
ulator connects to the target system via a probe unit 
and target cable. An IC plug at the end of the target 
cable fits into the CPU socket on the target board. The 
probe unit contains an NS32CG16 microprocessor for 
emulation. 

The emulator software resides in, and runs from, a DOS 
environment on the host. An on-screen menu enables 
command selection. 

Commands supported by the emulator include: 
Program down-loading 
Assembly language debugging 
Symbolic access to program variables 
Modification of CPU registers and Memory locations 
FPU and BPU slave processor support 
Single stepping and software breakpoints 
Trace display 
On-screen command prompting facility 

Full software support is provided by National’s GNX 
tools in the UNIX® environment of the SYS32/20 or 
SYS32/30 host. The object files produced by the 
compilation (or assembly) and linking process in the 
UNIX environment may be converted into DOS-format 
files and loaded into the emulator. 

2.0 Description of Features 

The NS32CG16 ISE consists of a main emulator unit, 
a probe unit with target cable and IC plug, an interface 
cable and PC interface board that resides on the host. 
Figure 1 shows a pictorial view of the emulator. 

2.1 NS32CG16 ISE System Configuration 

Figure 2 shows the NS32CG16 ISE system configura- 
tion. 

2.2 Description of the System 

The development system consists of the SYS32/20 or 
SYS32/30 host computer with the emulator interface 
board, the emulator and probe units and the IC plug 
(located at the end of the target cable) which fits into 
the CPU socket on the target board. The emulator 
SCSI interface board enables high speed parallel 
communication between the emulator and the host. 
The probe unit contains an NS32CG16 microproces- 
sor for emulation. 

The emulator unit consists of Controller, Memory, 
Trace and Breakpoint and Probe Interface boards. 
The Controller board communicates with the SCSI in- 
terface on the host and with all other boards in the 
emulator unit. The Probe Interface board communi- 
cates with the probe unit. 

The Memory Board provides 512 kbytes of emulation 
memory, with 0-wait state access at 15 MHz. Sixteen 
memory partitions may be mapped in 4 kbyte blocks 
with write protection capability. 4 kbytes of the avail- 
able memory is used by the emulator’s monitor, and 
the remaining memory may be used for emulation. 

The Trace and Breakpoint board supports trace and 
breakpoint capabilities. The 2k deep trace of address, 
data and status may be displayed in mnemonic or ma- 
chine formats, and may be triggered by any of 16 pre- 
defined events. Two hardware breakpoints (based on 
any of the 16 predefined events) and 36 software 
breakpoints are supported. 
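
The fixed-depth trace described above behaves, in outline, like a bounded buffer of bus records. The Python sketch below models only that idea. It assumes that "2k deep" means 2048 records and that the oldest record is discarded when the buffer is full; the class and field names are hypothetical and do not describe the emulator's actual hardware.

```python
from collections import deque

TRACE_DEPTH = 2048  # assumption: "2k deep" is read as 2048 records


class TraceBuffer:
    """Hypothetical model of a fixed-depth trace of (address, data, status)."""

    def __init__(self, depth: int = TRACE_DEPTH) -> None:
        # A deque with maxlen silently drops the oldest record when full.
        self.records = deque(maxlen=depth)

    def capture(self, address: int, data: int, status: str) -> None:
        """Store one bus cycle as an (address, data, status) record."""
        self.records.append((address, data, status))


if __name__ == "__main__":
    trace = TraceBuffer()
    for cycle in range(3000):
        trace.capture(0x1000 + 2 * cycle, cycle & 0xFFFF, "fetch")
    # Only the most recent TRACE_DEPTH records remain in the buffer.
    print(len(trace.records))
```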

Execution time measurement is accomplished with a 
resolution of 1 µs, and may be measured between two 
instruction execution addresses or between the occur- 
rence of any two of the 16 predefined events. 


Sixteen events may be defined based on the following: 

match on address and data 
no match on address and data 
match on status conditions (address fetch, data 
read/write, slave cycle and interrupt acknowledge) 

In specifying the formats for the address and data, any 
combination of 0s, 1s or Xs (don’t cares) may be used. 
For example, FFX0 or XXFF (in hexadecimal) are valid 
formats for specifying address and data. 
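
The don't-care notation can be made concrete with a small matcher. The Python sketch below assumes that each X stands for one whole hexadecimal digit, as in the FFX0 and XXFF examples; the function names and the match/no-match flags are hypothetical illustrations, not the emulator's implementation.

```python
def matches(pattern: str, value: int) -> bool:
    """Return True if `value` fits a hex pattern in which X is a don't-care digit."""
    digits = f"{value:0{len(pattern)}X}"
    if len(digits) != len(pattern):
        return False  # the value needs more digits than the pattern provides
    return all(p in ("X", d) for p, d in zip(pattern.upper(), digits))


def event_fires(addr_pattern: str, address: int,
                data_pattern: str, data: int,
                addr_match: bool = True, data_match: bool = True) -> bool:
    """Combine address and data tests; a False flag selects the 'no match' sense."""
    return (matches(addr_pattern, address) == addr_match
            and matches(data_pattern, data) == data_match)


if __name__ == "__main__":
    print(matches("FFX0", 0xFF30))   # third digit is a don't-care
    print(matches("XXFF", 0x12FE))   # last digit differs from the pattern
    print(event_fires("FFX0", 0xFF30, "XXFF", 0x00FF))
```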

All symbolic information in the source program is re- 
tained during debugging. 

The emulator software resides in a DOS environment 
on the system host. The emulator is invoked from the 
DOS directory in which the emulator software resides; 
commands may then be issued to control the operating 
mode of the emulator. An on-screen menu with a 
prompting facility enables selection of commands. 
Commands are provided to download, execute and debug 
programs. The command structure supports symbolic 
access to program variables. 

Software support is provided by National’s GNX tools 
in the UNIX environment on the SYS32 host. A user 
program may be edited, compiled and linked in this 
environment to obtain an executable object file. The 
object file may then be converted into DOS-format 
and copied into the DOS environment, by using the 
udcp (UNIX to DOS copy) utility in the UNIX environ- 
ment. This resulting DOS-format file may be directly 
loaded into emulation memory by emulator com- 
mands. The ducp (DOS to UNIX copy) utility may be 
used (in the UNIX environment) to convert files in the 
DOS-format (in the DOS environment) to UNIX-format 
(in the UNIX environment). Both udcp and ducp also 
support conversion of ASCII files. 



FIGURE 2. NS32CG16 ISE System Configuration 

2.3 The Development Process 

Figure 3 shows the development process in the different environments. 


[Figure 3 block diagram. Blocks shown: UNIX (Edit, Compile/Assemble & Link -> Object file; udcp, ducp utilities); DOS (Emulator Software); EMULATOR (Program Loading, Program Execution, Program Debug)] 


FIGURE 3. The Development Process 

2.4 Command Summary 

The following is a summary of the commands supported by the emulator. 


CONFIGURATION COMMANDS 


Mapping address Thru address Rom|RAm|TArget|Locked 

Maps 4 kbyte memory blocks in the specified address range as ROM, RAM, Target or Locked memory space. 
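
The 4 kbyte mapping granularity can be illustrated with a short calculation. The Python sketch below lists the block base addresses touched by an address range. It assumes an inclusive range and that the start address is rounded down to a block boundary; the function name is hypothetical and the rounding behaviour of the real Mapping command is not stated in this summary.

```python
BLOCK_SIZE = 4 * 1024           # 4 kbyte mapping granularity (from the text)
EMULATION_MEMORY = 512 * 1024   # 512 kbytes of emulation memory (from the text)


def blocks_in_range(start: int, end: int) -> list:
    """Base addresses of every 4 kbyte block touched by the range start..end."""
    if end < start:
        raise ValueError("end address must not be below start address")
    first = start - start % BLOCK_SIZE  # round down to a block boundary
    return list(range(first, end + 1, BLOCK_SIZE))


if __name__ == "__main__":
    print([hex(base) for base in blocks_in_range(0x1800, 0x3FFF)])
    # Number of 4 kbyte blocks available in the emulation memory.
    print(EMULATION_MEMORY // BLOCK_SIZE)
```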

MOnitor address 

This command maps a single 4 kbyte memory block at specified address for use by the monitor. 

Interrupt Enable|Disable Nmi|Int 

Enables or Disables the selected interrupt NMI or INT. 

DMa Enable|Disable 

Enables or Disables DMA transfers when the CPU is not accessing the bus. 

Break Enable|Disable Monitor|Rom write 

Enables or Disables a break in program execution when an access to Monitor address space or a write to the 
ROM address space occurs. 

Load Coff|Sform file Offset 

Loads a specified file in COFF or Motorola S formats into memory at a specified offset from address 0. 

Store file From address Thru address 

Stores the program data in the specified address range in memory into the specified file in Motorola S format. 

Clear 

Clears all the symbols used in the program. 








DISPLAY COMMANDS 
Display Configuration 

Displays the current configuration of the emulator. 

Display Register Format General|Single|Double 
Displays CPU registers in the specified format. 

Display Memory address Format Byte|Word|Dword|Qword|Mnemonic|Single|Double 

Displays memory contents starting at specified address in the specified format. 

Display Trace Trigger|TOp|Bottom|line Mnemonic|MAchine 

Displays results of the trace with the specified display position and display format. 

The display position may be specified as the trigger point, the top of the trace, the bottom of the trace or a 
specified line number on the trace. 

The display format may be specified to be in mnemonic or machine formats. 

Display SWbreak 

Displays all the software breakpoints. 

Display Event 

Displays all the pre-defined events. 

DATA MANIPULATION COMMANDS 

Register Format General|Single|Double 

Specifies the display and change formats for register commands. 

MOdify reg To data 

Modifies the specified register to the specified data. 

Memory address Format Byte|Word|Dword|Qword|Mnemonic|Single|DOuble 

Specifies the display and change formats for memory commands. 

MOdify address Thru address To data 

Modifies the memory locations in the specified address range to the specified data. 

EVENT SETUP COMMANDS 

Event 

Initiates the event definition process. 

Add Address =|# address Data =|# data Status Off|Fetch|Data|DRead|DWrite|Intack|Slave 

Adds an event with specified address match or nomatch, with specified data match or nomatch, and specified 
status conditions. 

Replace number Address =|# address Data =|# data Status Off|Fetch|Data|DRead|DWrite|Intack|Slave 

Replaces the event with the specified event number with the new event defined with the specified address 
match or nomatch, with specified data match or nomatch, and specified status conditions. 

DELete All|number 

Deletes all currently defined events or the event with the specified event number. 

SOFTWARE BREAKPOINT COMMANDS 


SWbreak 

Initiates the setup of software breakpoints. 

Add address 

Adds a software breakpoint at specified address. 

Replace number To address 

Replaces the breakpoint address of a pre-defined breakpoint (referenced by the specified number) with the 
new specified address. 

DELete All|number 

Deletes all the pre-defined software breakpoints or the pre-defined breakpoint with the specified number. 

Set Enable|Disable All|number 

Enables or disables the state of all pre-defined software breakpoints or the pre-defined software breakpoint 
(referenced by the specified number). 


PROGRAM EXECUTION COMMANDS 


RESet 

Resets the CPU. 

Go From address Until address1|Event# Or address2|Event# Times number 

Executes program from specified address until a match occurs on the specified address (address1) or on the 
specified event (hardware breakpoint #1), or until a match occurs on the specified address (address2) or on 
the specified event (hardware breakpoint #2). A count of the number of times a specified match occurs may also 
be used to control program execution. If the hardware breakpoint conditions are omitted, then program 
execution breaks on any software breakpoints that are set and enabled. 
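
One plausible reading of the Times parameter is that execution stops on the Nth occurrence of the match. The Python sketch below models that reading on a list of executed addresses. It is an assumption made for illustration; the function name and the stop-on-Nth-match behaviour are hypothetical and are not confirmed by this summary.

```python
def run_until(addresses, break_address, times=1):
    """Return the index at which execution would stop, or None if it never does.

    Execution is assumed to stop on the `times`-th fetch of `break_address`.
    """
    hits = 0
    for index, address in enumerate(addresses):
        if address == break_address:
            hits += 1
            if hits == times:
                return index
    return None


if __name__ == "__main__":
    # A loop body at 0x100 that is entered three times.
    executed = [0x100, 0x104, 0x100, 0x108, 0x100]
    print(run_until(executed, 0x100, times=3))
```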

Step From address 

Executes one instruction from the specified address. 

Trace From address Trigger address1|Event# Or address2|Event# 

Enables the trace from the specified address, with the trigger points being defined by address1 or a specified 
event, or by address2 or a specified event. 

MEAsure From address Start address1|Event# End address2|Event# 

Enables program execution from specified address with execution time being measured from specified start 
address1 or event until the specified end address2 or event. 

Quit 

Forces a break in program execution and stops the CPU. 


EMULATOR CONTROL COMMANDS 


CANcel 

Resets the emulator to its initial state at start-up. 

EXIT 

Exits from the emulator environment to the DOS environment. 

DOS 

Suspends temporarily to the DOS environment from the emulator environment. 

MAcro file 

Executes command lines stored in the specified macro file in text format. 

3.0 Specifications 

Environment          The NS32CG16 ISE is designed to operate in a 
                     laboratory environment. The emulator unit may 
                     be mounted horizontally (flat) or vertically. 

Temperature          Operative: +15°C to +50°C 
                     Storage: -40°C to +60°C 

Humidity             10% to 90% relative, non-condensing 

Altitude             Operative: 15000 feet 

Power Requirements   NS32CG16 ISE requires a standard AC power 
                     outlet (125V AC). 


4.0 Ordering Information 

NSS-ISE-CG16 NS32CG16 Emulator. 

■ 15 MHz NS32332/NS32382 Add-In board 
for an IBM® PC/AT® or compatible 
system 

■ 2-3 MIP system performance 

■ No wait-state, on-board memory in 4-, 8- 
or 16-Mbyte configurations 

■ Operating system derived from AT&T’s 
UNIX® System V Release 3 

■ Multi-user support 

■ GENIX™ Native and Cross-Support 
(GNX™) language tools, including 
assembler, linker, libraries, debuggers 

■ Support for other Series 32000® 
development products: 
— SPLICE 
— National’s Series 32000 Development Board family 
— Optimizing Compilers: C, FORTRAN 77, Pascal 

■ Easy-to-use DOS/UNIX interface 

Product Overview 

The SYS32™/30 is a complete, high-performance 
development package that converts an IBM PC/AT or 
compatible computer into a powerful multi-user system 
for developing applications that use National 
Semiconductor Embedded System Processors™ or 
Series 32000 microprocessor family components. The 
SYS32/30 add-in processor board containing the Series 
32000 device cluster with the NS32332 microprocessor 
allows programs to run on a personal computer at speeds 
greater than those of a VAX™ 11/780. The chip cluster 
on the processor board includes the NS32332 Central 
Processing Unit, NS32382 Memory Management Unit, 
NS32C201 Timing Control Unit and the NS32081 
Floating-Point Unit. Along with the processor board, the 
SYS32/30 package contains the Opus5™ operating 
system which is derived from GENIX V.3, National 
Semiconductor’s 
port of AT&T’s UNIX System V Release 3. Specially 
developed software is included to efficiently integrate 
the NS32332 processor board and the host PC/AT 
processor, allowing them to function as a complete 
UNIX computer system. National’s Series 32000 GE- 
NIX Native and Cross-Support (GNX) language tools 
are included in the SYS32/30 package to provide sta- 
ble and effective tools for software development. Op- 
tional compilers are available for FORTRAN 77, C, 
and Pascal. 

Functional Description 

15 MHz ADD-IN PROCESSOR BOARD FOR AN IBM PC/AT 
OR COMPATIBLE SYSTEM 

The SYS32/30 development package contains a 
processor board designed around the Series 32000 
chip set. This chip set includes the NS32332 Central 
Processing Unit, NS32382 Memory Management Unit, 
NS32C201 Timing Control Unit, and the NS32081 
Floating-Point Unit. 

This processor board forms the high-performance 
center of the computer system with the host PC/AT 
processor. Peripherals are under the control of the 
PC/AT’s microprocessor and are located either on the 
PC/AT motherboard or on other boards in the PC/AT 
chassis. The PC/AT handles all direct access to de- 
vices and serves as an integral dedicated I/O proces- 
sor. 


The SYS32/30 processor board plugs into the PC/AT 
bus, uses the standard control and data signals, and 
appears to the PC/AT as 16 bytes in the PC/AT In- 
put/Output (I/O) space. Communication between the 
PC/AT and the board is accomplished via this ad- 
dress space. This architecture allows the board to in- 
terface to the PC/AT in the same manner as any other 
PC/AT peripheral. The PC/AT processes I/O com- 
mands while the SYS32/30 processor board contin- 
ues with regular operation. I/O is requested via inter- 
rupt to the PC/AT, which then performs the data 
transfer using Direct Memory Access (DMA). (See 
Figure 1.) 

The processor board requires two slots in the PC/AT 
motherboard and plugs into a single long 16-bit bus 
slot. The space of the second slot is needed to ac- 
commodate the piggybacked memory board attached 
to the processor board. No additional connections are 
required. 

2-3 MIPS SYSTEM PERFORMANCE 

The NS32332 CPU and associated devices operating 
at 15 MHz provide computing power greater than that 
of a VAX 11/780. Sustained performance for the 
NS32332 device cluster is 2-3 VAX MIPS (Million In- 
structions Per Second). An example of relative per- 
formance using the widely recognized Dhrystone 
benchmark is shown in Figure 2. 



FIGURE 1 



FIGURE 2. SYS32/30 Dhrystone Program 
Compiled with GNX Version 3 C Compiler 
VAX 11/780 Dhrystone Data Obtained from USENET 

ON-BOARD MEMORY CONFIGURATIONS 
OF 4, 8 OR 16 MBYTES 

The processor board is configured with either 4, 8, or 
16 Mbytes of zero wait-state physical memory. It is 
possible to upgrade the 4- or 8-Mbyte configuration to 
16 Mbytes through the purchase of an optional 16- 
Mbyte memory card. 

OPERATING SYSTEM 

The SYS32/30 operating system is derived from 
GENIX V.3, National Semiconductor’s port of 
AT&T’s UNIX System V Release 3. 

The UNIX operating system is a powerful, multi-user, 
multitasking operating system that includes the follow- 
ing key features: 

Demand-Paged Virtual Memory 

Hierarchical file system 

Source Code Control System (SCCS) 

UNIX to UNIX copy (uucp) 

“make” utility 

Menu-driven system administration 

The UNIX operating system has a proven reputation 
as an effective and productive environment for efficient 
software development. UNIX allows multiple users to 
work simultaneously on the same computer and project. 
The Source Code Control System (SCCS) automatically 
tracks program revisions as development work 
progresses. The “make” software saves valuable time in 
regenerating complex software systems after changes are 
made. The uucp software allows users on different UNIX 
systems to communicate using electronic mail and to 
transfer files over dial-up or serial communications links. 
Menu-driven system administration is available for 
system setup, adding users, controlling communication 
lines, installing software packages, changing passwords, 
and other administrative functions. 

ADDITIONAL SUPPORT UTILITIES 
Many of the popular utilities from the Berkeley 4.3 
UNIX operating system, not contained in AT&T’s UNIX 
System V Release 3, are supplied as part of the pack- 
age. These utilities are listed in Table I. 

TABLE I. Bsd 4.3 Utilities 


The Tools for Documenters package, derived from the 
AT&T Documenter’s Workbench™ Utility, provides 
the Series 32000 programmer with the tools to pre- 
pare documentation. The major components of this 
package are shown in Table II. 

TABLE II. Tools for Documenters Utilities 

Name    Description 

nroff   A text formatter for line printers 
troff   A text formatter for typesetters 
mm      A macro package 
mmt     A macro package 
eqn     A troff preprocessor for typesetting mathematics on a phototypesetter 
neqn    A troff preprocessor for typesetting mathematics on a terminal 
tbl     A preprocessor for formatting tables 
pic     A preprocessor for graphic illustrations 
col     A filter to nroff for processing multicolumn text output, as from tbl 

NETWORKING CAPABILITY 

A SYS32/30 based development system configured to 
support networking using the TCP/IP protocol allows 
project development using multiple systems, including 
SYS32/30 based systems, VAX/VMS™ (using TCP/IP), 
SUN-3/SunOS™ and VAX/ULTRIX. The compatibility 
design of the GNX language tools allows software 
modules developed on these networked systems to be 
linked together on a single system for execution as one 
program. Networking requires that additional hardware 
and software be installed in the system. Third party 
products that enable networking are listed in the 
SYS32/30 configuration guide. 

MANUALS 

A complete manual set for the operating system and 
related software is included in the SYS32/30 pack- 
age. This includes: 

Installation instructions for the PC Add-in board 

Installation instructions for software 

UNIX System V.3 reference manuals and user guides 

GNX Language Tools Manuals 

Tools for Documenters Reference Manual 

Berkeley Utilities Manual 

MULTI-USER SUPPORT 

The SYS32/30 operating system is an interactive, 
multi-user, multitasking operating system. Many activi- 
ties or jobs can be performed simultaneously when 
serial ports are added to the host system. These additional 
serial ports are used for terminals, printers, modems, 
I/O-to-development boards, I/O-to-target hardware, 
or for communication with National’s SPLICE 
debugging tool. Information about third party products 
that provide additional serial ports is contained in the 
SYS32/30 configuration guide. 

GNX LANGUAGE TOOLS 

The GENIX Native and Cross-Support (GNX) lan- 
guage tools allow the user to compile, assemble, and 
link user programs to create executable files. These 
files can then be executed and debugged on a Series 
32000 development board, target system application 
hardware, or a 32000/UNIX-based system such as 
the SYS32/30. 

The GNX language tools include the assembler, link- 
er, debuggers, libraries, and the monitor software for 
all Series 32000 development boards in both PROM 
and source code form. 

The Series 32000 GNX language tools are based on 
AT&T’s Common Object File Format (COFF). Under 
COFF, object modules created by any of the GNX 
compilers or the GNX assembler may be linked to 
object modules of any other translator in the GNX 
tools. Optimizing compilers are available for C, 
FORTRAN 77, and Pascal. 

The COFF file format also allows object modules that 
have been created by the GNX tools on other development 
hosts (VAX/VMS or VAX/ULTRIX, for example) to be 
linked with modules created on the SYS32/30 system. 
This flexibility is most valuable where non-centralized 
software development is desired and the systems are able 
to transfer or share files via a common network. 
Information for configuring the SYS32/30 for integration 
into a network is contained in the configuration guide. 

Compilers are available separately as optional soft- 
ware to allow individual selection of the application 
language. The C, FORTRAN 77 and Pascal compilers 
are the result of National’s optimizing compiler project 
and reflect state-of-the-art compiler technology for op- 
timizing execution speed. For additional details about 
the GNX tools consult the GNX tools data sheet. 

SUPPORT FOR AN INTEGRATED DEVELOPMENT 
ENVIRONMENT 

The SYS32/30 contains the functionality and compati- 
bility needed to utilize other tools available from Na- 
tional Semiconductor for developing and debugging 
Series 32000-based applications. These tools include 
the SPLICE software debugger, NS32CG16 ISE, the 
Series 32000 Development Board set, and National’s 
Embedded System Processor evaluation boards for 
the NS32CG16 and NS32GX32 processors. 

The NS32CG16 ISE is a full featured emulator for de- 
velopment of NS32CG16 based systems. Software is 
developed on the SYS32/30, then transferred to the 
DOS partition of the development system for down- 
load by the ISE. 

The SPLICE development tool provides a communica- 
tion link between a Series 32000 target and a devel- 
opment system host. This connection allows users to 
download and map their software onto target memory 
and then debug this software using National Semicon- 
ductor’s GNX debugger. Consult the SPLICE data 
sheet for more information. 

The GNX debugger also directly supports the Hewlett- 
Packard HP64772 NS32532/NS32GX32 in-system 
emulator. This combination provides powerful inte- 
grated support for high-level source debugging and in- 
system emulation of the NS32532 or NS32GX32 proc- 
essors. 

The Series 32000 development boards and Embed- 
ded System Processor evaluation boards used with 
the SYS32/30 are specifically designed to assist the 
user in evaluating and developing hardware and soft- 
ware for embedded systems and the Series 32000 
family of CPUs. 

DOS/UNIX INTEGRATION 

The SYS32/30 PC add-in development package al- 
lows easy transfer of data between DOS and the 
UNIX operating system. A system console user can 
switch between either operating system using only a 
few keystrokes. A shell interface allows DOS com- 
mands to be executed from the UNIX shell, UNIX 
commands to be executed from DOS, and files to be 
transferred between the UNIX and DOS partitions on 
the system disk. In addition, the user can suspend the 
SYS32/30 operation, enter DOS, run an application, 
and then return to the SYS32/30 environment. 

Series 32000 Application Development 

The SYS32/30 with the PC/AT operates as a local 
host computer system for integrating application soft- 
ware into target prototype boards containing Series 
32000 components. Programs can be written in as- 
sembly language or in a higher level language. Option- 
al compilers are available for C, FORTRAN 77, and 
Pascal. 

During compilation, the compilers generate assembly 
code which is assembled by the GNX assembler. (See 
Figure 3.) The output of the assembler is an object file 
which can be linked to other object files and/or libraries, 
resulting in an executable file. 

Since the SYS32/30 provides a Series 32000 native 
environment, the executable file may be run on the 
host SYS32/30 system or loaded into RAM on either 
a target system, an Embedded System Processor 
evaluation board or one of the Series 32000 develop- 
ment boards. The source-level software debuggers in 
the GNX tools provide powerful facilities for debug- 
ging software on the target system. 

The GNX debugger is capable of downloading and 
controlling the execution of software on the target sys- 
tem. Executable monitor software is provided in 
PROMs in the SYS32/30 package for the Series 
32000 development boards and the Embedded Sys- 
tem Processor evaluation boards. Monitor software is 
also provided in source form in the GNX language 
tools so application designers can modify and port the 
monitor to suit the needs of their target system. 

After debugging, the executable file created by linking 
can also be converted to PROM format using the GNX 
nburn utility. 

Configuring a System 

The SYS32/30 PC Add-In package supports a variety 
of configurations. Based on developer needs, the final 
configuration may need extra serial I/O ports, and/or 
networking capability. A hard disk of sufficient size is 
also an important part of the configuration. A configu- 
ration guide that outlines available options and recom- 
mended products for configuring the SYS32/30 devel- 
opment system is available. 

Host system elements required for SYS32/30 opera- 
tion are: 

— IBM PC/AT or compatible system 

— Two full length slots in the motherboard 

— 512 Kbytes of RAM 

— PC-DOS 3.1 or later 

— 1.2-Mbyte floppy disk drive 

— Adequate hard disk storage (see the next section 
on disk size) 

Note: The SYS32/30 processor board actually plugs into a single slot. 
The second slot is required to accommodate the space taken by 
the piggybacked memory board attached to the NS32332 proces- 
sor board. 

The SYS32/30 PC/AT Add-In Development Package 
runs on an IBM PC/AT or compatible computer. If an 
IBM PC/AT is not used for the host system, it is impor- 
tant to remember that compatibility can vary between 
IBM PC/AT compatible systems. The SYS32/30 proc- 
essor board may not be adequately supported by sys- 
tems that lack full IBM PC/AT compatibility. The con- 
figuration guide available contains a list of IBM PC/AT 
compatible systems that have the required compatibil- 
ity. 

HARD DISK CAPACITY 

Several factors influence the size selected for a hard 
disk. Consideration should include the number of us- 
ers for the system, space for user files, the size of the 
application to be developed, and extra software pack- 
ages and compilers that must reside on the system. 
For example, a 50-Mbyte hard disk is the minimum 
size recommended for a SYS32/30-based develop- 
ment environment. This provides sufficient space for a 
single-user account, the UNIX operating system and 
utilities, the GNX tools, compiler software, basic DOS 
software, and a moderate size application. Disk drives 
with even greater capacity than the minimum sizes in- 
dicated here should be considered for additional users 
or software and to provide for growth of the system. 
When selecting hard disk drives or other peripheral 
devices, it is important that the device conform to the 
industry standard for peripheral devices designed for 
use on the PC/AT bus. 


Basic Kits 

The SYS32/30 Add-In Development package is avail- 
able in three basic kits: 

NSS-SYS30-KIT1    For IBM-AT and compatible systems 
                  PC Add-In coprocessor board with 4 Mbytes on-board memory 
                  UNIX System V.3 based operating system 
                  GNX Language Tools 
                  Tools for Documenters 
                  Berkeley Utilities 
                  Installation instructions for the PC Add-In board 
                  Installation instructions for software 
                  UNIX System V.3 reference manuals and user guides 
                  GNX Language Tools Manuals 
                  Tools For Documenters Reference Manuals 
                  Berkeley Utilities Manual 

NSS-SYS30-KIT2    Same as KIT1 except with 8 Mbytes of on-board memory 

NSS-SYS30-KIT3    Same as KIT1 except with 16 Mbytes of on-board memory 

MEMORY UPGRADE 

To upgrade the memory size to 16 Mbytes after the 
purchase of KIT1 or KIT2, the following 16-Mbyte 
memory board must be purchased to replace the ex- 
isting memory board: 

NSS-SYS30-MEM16 16-Mbyte memory board. 

Optional Software Packages 

(A prerequisite for use is the purchase of one of the 
above basic kits). 

NSW-C-3-BHBF3       Optimizing C Compiler 
NSW-F77-3-BHBF3     Optimizing FORTRAN 77 Compiler 
NSW-PAS-3-BHBF3     Optimizing Pascal Compiler 
NSW-NET-BHBF3       Networking software 
NSP-SYS32/V3-MS     Additional operating system manual set 

Series 32000® GENIX™ Native and 
Cross-Support (GNX) Development Tools 

(Version 3) 



■ Complete software development 
environment for Series 32000 

■ Supports software development on 
VAX™, Sun-3®, and SYS32™ 
development hosts 

■ Supports Common Object File Format 
(COFF) 

■ Includes versatile configuration 
definition utility 

■ Includes source code for board-level 
monitors 

■ Includes complete floating-point unit 
emulation software 

■ Supports optional C, FORTRAN 77, and 
Pascal optimizing compilers 

■ Supports SPLICE development tool 


Introduction 

The Series 32000 GNX-Version 3 (GENIX Native and 
Cross-Support) development tools consist of assem- 
bler, linker, debuggers, monitors, basic I/O routines, 
libraries, optional high-level language compilers, and 
other tools to aid in the development of applications 
for the Series 32000 microprocessor family. The GNX 
tools allow users to compile, assemble, and link appli- 
cation programs to create executable files. These files 
can then be executed and debugged on Series 32000- 
based development hosts, such as the SYS32/20 and 
SYS32/30, or on a Series 32000-based target board. 
After debugging, the executable files can be converted 
to binary/hexadecimal files suitable as input to 
PROM programmers for burning PROMs. 

The Series 32000 GNX development tools are based 
on the Common Object File Format (COFF), as devel- 
oped by AT&T and enhanced by National Semicon- 
ductor Corporation. This allows files developed on dif- 
ferent hosts and in different high-level languages to 
be easily integrated. 

Supported Development Hosts 

The Series 32000 GNX development tools are available 
hosted for cross-development on the VAX series of 
computers, running the VMS™, UNIX® (bsd), and ULTRIX 
operating systems and on a Sun-3 workstation running 
SunOS™. Also supported are National Semiconductor’s 
SYS32/20 and SYS32/30 development environments. 
Table I summarizes the GNX commands for each 
environment. 

FIGURE 1. Sample Development Process 
(Figure note: Libraries are maintained by AR.) 

TABLE I. Commands for SYS32, 
VAX/UNIX, and VAX/VMS 

SYS32     VAX/UNIX   VAX/VMS 

ar        nar        nar 
as        nasm       nasm 
cc        nmcc       nmcc 
          ncmp       ncmp 
dbg32     dbg32      dbg32 
f77       nf77       nf77 
gts       gts        gts 
ld        nmeld      nmeld 
lorder    nlorder 
monfix    monfix     monfix 
nburn     nburn      nburn 
nm        nnm        nnm 
pc        nmpc       nmpc 
size      nsize      nsize 
strip     nstrip     nstrip 

The SYS32/20 and SYS32/30 PC-Add-In Development
Packages are complete, high-performance
packages that convert an IBM-PC/AT™ or compati- 
ble computer into a powerful multi-user system for de- 
veloping applications that use the Series 32000 fami- 
ly. The SYS32 systems are based on the Series 
32000 processor family; the SYS32/20 includes an 
NS32032 Central Processing Unit, and the SYS32/30 
is based on the NS32332 CPU. Both the SYS32/20 
and SYS32/30 run a derivative of the AT&T System 
V.3 UNIX operating system. Because these host sys- 
tems are themselves based on the Series 32000 proc- 
essor family, application code can be debugged on 
the host system without down-loading to target hard- 
ware. 

Figure 1 illustrates a typical development process. 
Tools Components

The GNX Development Tools comprise the following 
utilities and support libraries: 

Ar 

This utility maintains groups of files combined into a
single archive file. Ar is used to create and update
library files as used by the GNX linker ld.

As 

The GNX assembler, as, assembles Series 32000 as- 
sembly language source programs and generates re- 
locatable object modules. Relocatable object modules 
must be linked to create executable load modules.

DBG32

DBG32 is an interactive symbolic debugger. It can be 
used for remote debugging in conjunction with a host 
and any target hardware that includes a Series 32000 
GNX monitor. DBG32 allows source-level debugging 
and includes an easy-to-use on-line help facility.

Floating-Point Enhancement and
Emulation (FPEE) Library

When a floating-point unit (FPU) is not present, the 
floating-point enhancement and emulation (FPEE) li- 
brary provides low-cost floating-point support by emu- 
lating the Series 32000 FPU instructions. When an 
FPU is present, FPEE enhances the FPU by providing 
additional functionality as recommended by Draft 10 
of the ANSI/IEEE Task 754 Proposal for Binary Float- 
ing-Point Arithmetic (IEEE 754). FPEE meets the IEEE 
754 standard for double-precision arithmetic. 

The FPEE library is provided in source form and as a 
binary library suitable for its particular GNX tool-set 
environment. The source includes all support routines 
necessary to build the FPEE library. The FPEE library
can be configured to enhance/emulate either the
NS32081 FPU or the NS32381 FPU.

GNX Target Setup (GTS) 

The GNX tools support the full line of Series 32000 
central processing units and peripheral devices, 
based on user-defined parameters. The GNX Target 
Setup (GTS) utility allows users to easily define the 
characteristics of the target system at one time. This 
information is saved in a file on the host system, which 
is examined each time a GNX utility is invoked. These 
parameters are used to tailor the application code to 
characteristics of the particular hardware. 

GTS operates both interactively and non-interactively 
and includes an easy-to-use interface and on-line help 
facility. 

Ld 

The GNX linker, ld, creates executable files by combining
object files, providing relocation, and resolving
external references. The linker also processes sym- 
bolic debugging information. The linker includes a 
powerful directives language, which allows the user to 
precisely control the linking process. 

Lorder 

Lorder finds ordering relations for object libraries. The 
input may be one or more object or library archive 
(see ar) files. The output of lorder can be processed 
to find an ordering of a library suitable for one-pass 
access by the linker. 

Math Libraries 

The math libraries (libm.a and lib381m.a) contain standard
math functions that support both the NS32081
and NS32381 floating-point units. These functions are 
highly optimized for the Series 32000 architecture. 


Table II contains a list of the available math functions. 


TABLE II. Available Math Functions

acos       exp        fdrem      fmod          fpow        log1p
acosh      exp2       fexp       fneg          fpstrpvctr  log2
asin       expm1      fexp2      fp_gmathenv   frelation   neg
asinh      fabs       fexpm1     fp_getexptn   frem        nextdouble
atan       facos      ffabs      fp_getround   frint       nextfloat
atan2      facosh     ffinite    fp_gettrap    fsin        pi
atanh      fasin      ffloor     fp_procentry  fsinh       pow
bessel     fasinh     ffmod      fp_procexit   fsqrt       randomx
cabs       fatan      fhypot     fp_smathenv   ftan        relation
cbrt       fcabs      finf       fp_setexptn   ftan2       rem
ceil       fcbrt      finite     fp_setround   ftanh       rint
compound   fceil      flog       fp_settrap    gamma       sin
copysign   fcompound  flog10     fp_testrap    hypot       sinh
cos        fcopysign  flog1p     fp_tstexptn   inf         sqrt
cosh       fcos       flog2      fpgtrpvctrv   log         tan
drem       fcosh      floor      fpi           log10       tanh


Note: All math library functions are provided in single and double precision versions. 

Monitors 

Mon16, mon32, mon332, mon332b, mon532 and 
mon32GX are PROM-based firmware monitors for use 
on a Series 32000-based development board. The 
monitors allow the user to load, execute, and debug 
development board programs with the dbg32 debug- 
ger running on a host computer system. The monitors 
also provide run-time services, such as physical I/O, 
interrupt handling, and error handling in the form of 
supervisor calls. 

Source to each monitor is provided so that it may be
modified, assembled, linked, and installed on other
Series 32000-based target boards.

Monfix 

Monfix is a utility that creates a Series 32000 boot- 
strap program by modifying a Series 32000 GNX exe- 
cutable file. 

Nburn 

Nburn loads the specified bytes of a file to an EPROM 
burner in one of several user-specified formats, includ- 
ing ASCII-HEX and S-record. 
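
As an illustration of the S-record output format mentioned above, the following C sketch formats a single Motorola S1 data record (16-bit load address). It is not taken from nburn: the function name format_s1_record and its parameters are hypothetical, and only the general S1 layout (record type, byte count, address, data, one's-complement checksum) is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of nburn: formats one Motorola S1 record
 * (16-bit load address) holding `len` data bytes into `out`.
 * Returns the number of characters written, excluding the terminating NUL. */
int format_s1_record(unsigned addr, const unsigned char *data, size_t len,
                     char *out, size_t out_size)
{
    /* The byte-count field covers address (2) + data (len) + checksum (1). */
    unsigned count = (unsigned)len + 3u;
    unsigned sum;
    size_t i;
    int n;

    assert(len <= 252);                 /* count must fit in one byte */
    assert(out_size >= 2 * len + 11);   /* "S1" + hex digits + NUL */

    n = sprintf(out, "S1%02X%04X", count, addr & 0xFFFFu);
    sum = count + ((addr >> 8) & 0xFFu) + (addr & 0xFFu);

    for (i = 0; i < len; i++) {
        n += sprintf(out + n, "%02X", (unsigned)data[i]);
        sum += data[i];
    }

    /* Checksum: one's complement of the low byte of the running sum. */
    n += sprintf(out + n, "%02X", (~sum) & 0xFFu);
    return n;
}
```

For example, two data bytes 0x01 and 0x02 at address 0x0000 produce the record S10500000102F7.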

Nm 

The nm utility displays the symbol table of a Series 
32000 GNX object file. 

Size 

The size utility displays size information for each sec- 
tion and optional header information of a Series 32000 
GNX object file. 

Strip 

The strip utility strips symbol and line number infor- 
mation from a Series 32000 GNX object file. 

Optional Compilers 

A substantial amount of application code is developed 
in a high-level language; therefore, the speed and effi- 
ciency of the application are functions not only of 
processor speed, but also of the quality of the code generated
by the high-level language compiler. An inefficient
compiler can exact a significant performance penalty.
Likewise, a significant performance improvement
can be achieved for a much lower cost in software
rather than hardware. For this reason, National Semiconductor
has developed a line of optimizing compilers
that generate extremely efficient code for the Series
32000 architecture.

Each of the optimizing compilers includes the state-of-the-art
GNX optimizer, based on advanced optimization
theory developed over the past 15 years. In addition,
because all GNX-Version 3 optimizing compilers
use a standard calling sequence, internal intermediate
representation, and object file format, mixed-language
programming is greatly simplified, aiding in the porting
of existing applications to the Series 32000 architecture.

C Optimizing Compiler 

The GNX-Version 3 C Optimizing Compiler fully imple- 
ments the C programming language, as defined in The 
C Programming Language by B. Kernighan and D. Rit- 
chie. The C Optimizing Compiler is also compatible 
with the UNIX System V C compiler, derived from the 
portable C compiler (pcc). Several features of the 
draft ANSI C standard (X3J11) are supported. 

FORTRAN 77 Optimizing Compiler 
The GNX-Version 3 FORTRAN 77 Optimizing Compil- 
er fully implements the FORTRAN 77 programming 
language, as defined by the American Standard publi- 
cation Programming Language FORTRAN (ANSI 
X3.9-1978). In addition, a command-line option is pro- 
vided that forces the compiler to accept as input only 
programs that adhere to the FORTRAN 66 standard. 

Pascal Optimizing Compiler 
The GNX-Version 3 Pascal Optimizing Compiler fully 
implements the Pascal programming language, as de- 
fined by the International Standards Organization 
(ISO) standard ISO dp7185 level 1. Several useful 
extensions to the Pascal language are supported. A 
command-line option is provided that forces the com- 
piler to accept as input only programs that adhere to 
the ISO standard. 

SPLICE Support 

The GNX development tools enable the use of the 
SPLICE development tool, which can be used to de- 
bug software/hardware on a Series 32000 target. 
SPLICE provides a communication link between a Se- 
ries 32000 target and a development system host that 
allows users to down-load and map their software 
onto target memory and debug this software using the 
dbg32 debugger. The monitor resident on the SPLICE 
communicates with dbg32 on the development host. 

Source Products 

The GNX development tools, as well as the optional 
optimizing compilers, are available in source form for 
use in porting to other potential development environ- 
ments. Source code is provided on a VAX/UNIX bsd 
tape. Contact Series 32000 Marketing for more infor- 
mation regarding GNX source availability. 



Licensing 

All binary versions of the Series 32000 GNX develop- 
ment tools require the execution of National Semicon- 
ductor’s binary user agreement. Because the GNX de- 
velopment tools contain AT&T proprietary code, a 
System V source license is prerequisite for obtaining a 
source version of the GNX tools. Contact Series 
32000 Marketing for more information regarding spe- 
cific licensing issues. 

Customer Support 

National Semiconductor offers a full 90-day warranty 
period. Extended warranty provisions can be arranged 
by calling National Semiconductor’s Technical Sup- 
port Engineering Center at the toll-free number listed 
below. 

National Semiconductor’s Technical Support Engi- 
neering Center has highly trained technical specialists 
available to assist customers over the telephone with 
any product-related technical problems. 

For more information, please call (800) 759-0105 (in 
the United States and Canada). Outside North Ameri- 
ca, please contact your local National Semiconductor 
office. 

Ordering Information 

Supported Host Environments and Order Codes:

SYS32/20:                 NSW-ASM-3-BHAF3 (included with SYS32/20 kit)
SYS32/30:                 NSW-ASM-3-BHBF3 (included with SYS32/30 kit)
VAX/VMS:                  NSW-ASM-3-BRVM
VAX/ULTRIX (UNIX bsd):    NSW-ASM-3-BRVX
MicroVAX/VMS:             NSW-ASM-3-BCVM
MicroVAX/ULTRIX:          NSW-ASM-3-BCVX
Sun-3:                    NSW-ASM-3-BCSX

Each software package is delivered with one copy of 
each appropriate manual. Additional manual sets may 
be ordered using the following order codes: 

NSP-ASM-NX3-MS: 

Manual set included with NSW-ASM-3-BHAF3 and 

NSW-ASM-3-BHBF3 

NSP-ASM-X3-MS: 

Manual set included with NSW-ASM-3-BRVX,
NSW-ASM-3-BCVX, and NSW-ASM-3-BCSX

NSP-ASM-M3-MS: 

Manual set included with NSW-ASM-3-BRVM and 
NSW-ASM-3-BCVM 

NSP-C-V3-M: 

Manual set delivered with Optimizing C compiler (all 
hosts) 

NSP-F77-V3-M: 

Manual set delivered with Optimizing FORTRAN 77 
compiler (all hosts) 

NSP-PAS-V3-M: 

Manual set delivered with Optimizing Pascal compiler 
(all hosts) 

For further information regarding National Semicon- 
ductor’s software development tools and develop- 
ment hosts, please refer to the following datasheets: 
GNX-Version 3 C Optimizing Compiler 
GNX-Version 3 FORTRAN 77 Optimizing Compiler 
GNX-Version 3 Pascal Optimizing Compiler 
SYS32/20 PC-Add-In Development Package
SYS32/30 PC-Add-In Development Package
SPLICE Development Tool 
■ Generates high-quality code for the
Series 32000 architecture

■ Implements the C Language as defined 
by B. Kernighan and D. Ritchie in The C 
Programming Language 

■ Uses state-of-the-art optimization 
techniques 


■ Supports mixed-language programming 

■ Includes a complete run-time C library 
and highly optimized math library 

■ Incorporates many draft-proposed ANSI 
C standard (X3J11) features 

■ Compiles under UNIX®, ULTRIX™, and
VMS™ operating systems


1.0 Introduction 

A substantial amount of application code is developed 
in a high-level language. Therefore, the speed and ef- 
ficiency of the application are functions not only of 
processor speed, but also of the quality of the code generated
by the high-level language compiler. An inefficient
compiler can exact a significant performance penalty.
Likewise, a significant performance improvement
can be achieved for much lower cost in software rather
than hardware. For this reason, National Semiconductor
has developed a line of optimizing compilers
that generate extremely efficient code for the Series
32000 architecture.

1.1 Product Overview 

The Series 32000 GNX-Version 3 C Optimizing Com- 
piler is a member of National Semiconductor’s opti- 
mizing compiler family, which also includes compilers 
that support the Pascal and FORTRAN 77 program- 
ming languages. Because all three optimizing compil- 
ers use a standard calling sequence, internal interme- 
diate representation, and object file format, mixed-lan- 
guage programming is greatly simplified. The ability to 
use mixed-language programming simplifies the port- 
ing of pre-existing applications and code reuse. A de- 
tailed discussion of mixed-language programming is 
presented in the GNX-Version 3 C Optimizing Compil- 
er Reference Manual. 

The C Optimizing Compiler fully implements the C
Language, as defined by B. Kernighan and D. Ritchie.
The C Optimizing Compiler is also compatible with the
UNIX System V C compiler, derived from the fully portable
C compiler (pcc). Several features of the draft
ANSI C standard (X3J11) are supported.

The input to the C Optimizing Compiler is a C lan- 
guage source program. The output, controlled by 
command-line options, is either a Series 32000 exe- 
cutable module, a Series 32000 object module, or Se- 
ries 32000 assembly code. 

1.2 Native and Cross-Support 

The GNX-Version 3 C Optimizing Compiler is available
hosted as a cross-support compiler on the VAX™ series
of computers, running the VMS, UNIX (bsd), and
ULTRIX operating systems and on a Sun-3® workstation
running SunOS™. Also supported are National
Semiconductor’s SYS32™/20 and SYS32/30 development
environments.

1.3 GNX Development Tools 

The GNX-Version 3 C Optimizing Compiler is an inte- 
gral component of the GNX Cross-Development tool 
set. The GNX-Version 3 Assembler Package includes 
the Series 32000 assembler, the GNX linker, debug- 
gers, libraries, and development board monitors. The 
GNX-Version 3 Assembler Package is a prerequisite 
for the GNX-Version 3 C Optimizing Compiler. See the 
GNX-Version 3 Development Tools Datasheet for 
more information on the GNX Tools. 

The SYS32/20 and SYS32/30 PC-Add-In Development
Packages are complete, high-performance
packages that convert an IBM®-PC™/AT or compatible
computer into a powerful multi-user system for
developing applications that use the Series 32000 family.
The SYS32 systems are based on the Series
32000 processor family; the SYS32/20 includes an
NS32032 Central Processing Unit, and the SYS32/30
is based on the NS32332 CPU. Both the SYS32/20
and SYS32/30 run a derivative of the UNIX System
V.3 operating system. Because these host systems
are themselves based on the Series 32000 processor
family, application code can be debugged on the host
system without down-loading to target hardware.

2.0 Compiler Structure 

The C Optimizing Compiler is a modular language 
processor consisting of five separate programs: the 
driver, the macro preprocessor (cpp), the parser (front 
end), the optimizer, and the code generator. 

2.1 The Driver 

The driver is a program that parses and interprets the 
command line and, in turn, sequentially calls each of 
the other programs, based on its input and the com- 
mand-line options invoked. Under the UNIX operating 
system, the assembler and linker are also automati- 
cally invoked by the driver as required; under VMS, 
the assembler is invoked by the driver, and linking is 
done at the command line. 

2.2 The Macro Preprocessor (cpp) 

The macro preprocessor is the standard C preproces- 
sor, known as cpp. The macro preprocessor’s input is 
the C source program with preprocessor macros; its 
output is processed C code, with all preprocessor 
commands expanded and transformed as necessary. 
The macro preprocessor can be used to define con- 
stants, insert text from another file, or conditionally 
include or exclude source code from compilation 
based on a testable condition. 
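
The three uses of the preprocessor listed above can be sketched in a few lines of C. The macro names BUFFER_SIZE and TARGET_HAS_FPU and both function names are hypothetical, chosen only for illustration; they are not GNX-defined symbols.

```c
#include <assert.h>   /* text insertion: the header's text replaces this line */
#include <stddef.h>

/* Constant definition: cpp replaces BUFFER_SIZE with 256 before parsing. */
#define BUFFER_SIZE 256

/* Hypothetical configuration macro, given a default if not already defined. */
#ifndef TARGET_HAS_FPU
#define TARGET_HAS_FPU 0
#endif

static char buffer[BUFFER_SIZE];

/* Returns the capacity chosen by the BUFFER_SIZE macro. */
size_t buffer_capacity(void)
{
    return sizeof buffer;
}

/* Conditional compilation: only one of the two bodies reaches the parser. */
const char *float_support(void)
{
#if TARGET_HAS_FPU
    return "hardware";
#else
    return "emulated";
#endif
}
```

With the default shown here, float_support() returns "emulated"; defining TARGET_HAS_FPU as 1 before this point selects the other branch without changing the rest of the source.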

2.3 The C Language Parser (front end) 

The front end of the C Optimizing Compiler is derived 
from the UNIX portable C compiler (pcc), with bug fix- 
es and extensions included. The front end’s input is C 
source code; its output is an intermediate representa- 
tion that can be passed either to the optimizer or the 
code generator. 

Among the extensions implemented in the front end 
are: 

• Unsigned constants 

• Enumerated types 

• Improved structure manipulation; structures can be 
assigned, passed as parameters to functions, and 
returned by functions. Structure and union member 
names can be reused in other structures and un- 
ions in the same module. No limit is imposed on the 
size of structures. 


• Void data type 

• Signed and unsigned bitfields 

• Volatile type; variables can be declared as type 
volatile to make them inaccessible to the optimiz- 
er. This is useful for mapping to external devices. 

• Const keyword 

The void, volatile, and const extensions conform to 
ANSI C standard (X3J11) features. 
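
The extensions listed above can be seen together in a short C sketch. It is written in present-day C (including function prototypes, which the list above does not mention) so that it compiles with a current compiler; all type, variable, and function names are hypothetical, and device_status merely stands in for a memory-mapped device register.

```c
#include <assert.h>

/* Unsigned constant */
#define MAX_STATUS 40000U

/* Enumerated type */
enum color { RED, GREEN, BLUE };

/* Signed and unsigned bitfields */
struct flags {
    unsigned int ready : 1;
    signed int   level : 4;
};

/* The member name `x` is reused in two different structures. */
struct point  { int x; int y; };
struct vector { int x; int z; };

/* Const keyword */
static const int scale = 2;

/* A structure passed as a parameter and returned by a function. */
struct point scale_point(struct point p)
{
    struct point r;
    r.x = p.x * scale;
    r.y = p.y * scale;
    return r;
}

/* Volatile object: every access written in the source must be performed,
 * so the optimizer may not cache or remove reads and writes to it. */
static volatile unsigned int device_status;

/* Void data type used as a return type. */
void set_status(unsigned int value)
{
    device_status = (value > MAX_STATUS) ? MAX_STATUS : value;
}

unsigned int get_status(void)
{
    return device_status;
}
```

A caller can then write b = scale_point(a); to assign one structure to another in a single statement.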

The output of the front end is a proprietary intermedi- 
ate representation that can be either used as input to 
the optional optimizer phase or passed directly to the 
code generator. This intermediate language, known 
as IR32, is an attributed tree-structured representa- 
tion. IR32 is completely high-level language indepen- 
dent; all of the GNX optimizing compilers produce the 
same internal representation. This allows a common 
back end to be shared by all GNX optimizing compil- 
ers. 

2.4 The Optimizer 

The state-of-the-art GNX optimizer is based on ad- 
vanced optimization theory developed over the past 
15 years. Depending on the compiler and application 
code characteristics, the GNX optimizer improves 
code performance from 15 to 200 percent beyond that
of other compilers. 

The GNX-Version 3 C optimizer is the most innovative 
component of the GNX Optimizing Compilers. The op- 
timizer’s input is an IR32 intermediate representation 
file; its output is an optimized IR32 file. The optimiza- 
tion pass is optional. 

Unlike many other optimizers, which are local in nature,
the GNX optimizer works across the whole program
by using sophisticated global-data-flow analysis.
The optimization process can be thought of as a five-step
sequence. The sequence of optimizations has
been carefully chosen to ensure that each optimization
is performed to maximum effect and to provide
more opportunities for later optimizations. These
steps are as follows:

Step One — Local Optimizations 
The source program is read in one procedure at a
time. Each procedure is then partitioned into basic blocks:
sequences of code that have branches only at entry 
or exit. Optimizations performed at this stage include: 

• Value Propagation — replacing variables with their 
most recent values 

• Constant Folding— evaluating expressions that 
consist solely of constants 

• Redundant Assignment Elimination — eliminating
assignments that are not used or that are reassigned
prior to use

The relationships between the various optimizations
are illustrated as follows:

The program sequence

    a = 4;
    if (a*8 < 0) b = 15;
    else b = 20;
    ... code which uses b but not a ...

is translated by the compiler front end into the following
intermediate code

        a <- 4
        if (a*8 >= 0) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: ...

which is transformed by “value propagation” into

        a <- 4
        if (4*8 >= 0) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: ...

which after “constant folding” becomes

        a <- 4
        if (true) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: ...

“dead code removal” results in

        a <- 4
        goto L1
    L1: b <- 20
    L2: ...

which is transformed by another “flow optimization”
into

        a <- 4
        b <- 20

Since there is no further use of a, a <- 4 is a
“redundant assignment:”

        b <- 20


Step Two — Flow Optimizations
A flow graph is constructed. Each basic block is a
node in the graph, with “arrows” drawn to represent
program flow. Optimizations performed at this stage
include:

• Branch Elimination— branches to branches are 
removed. Code may be reordered to eliminate 
branches. 

• Dead Code Removal — code that will never be ex- 
ecuted is removed. 

The following diagram is an example of a flow graph:

[Figure: flow graph]

Step Three — Global-Data-Flow Analysis
Global-data-flow analysis is a process that identifies 
desirable global code transformations that can speed 
code execution. Since studies have shown that most 
programs spend 90 percent or more of their time in 
loops, particular attention is paid to transformations 
that allow loops to execute faster. This involves sever- 
al techniques: 

• Fully Redundant Expression Elimination— Ex- 
pressions that are computed twice on the same 
path are instead computed only once, with the re- 
sult saved, usually in a register. 

• Partially Redundant Expression Elimination — If 
a path exists that contains a computation and a 
path exists that does not contain a computation, 
the computation is placed in each path. This makes 
the expression fully redundant, allowing it to be 
eliminated. 

• Loop Invariant Code Motion — Values that are 
computed repeatedly inside of a loop are instead 
computed outside the loop and the result saved. 

• Strength Reduction — Complex instructions are
replaced by simpler substitutes (e.g., multiplications
may be replaced with a sequence of additions).

• Induction Variable Elimination — Variables that 
maintain a fixed relation to other variables are re- 
placed. 
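
The effect of several of the loop transformations above can be illustrated by applying them by hand to a small C function. This is only a sketch of the kind of rewrite such an optimizer might perform, not output of the GNX optimizer; the function names sum_before and sum_after are hypothetical.

```c
#include <assert.h>

/* As written: the loop recomputes a*b on every pass, multiplies i by 4,
 * and indexes the array through i. */
long sum_before(const int *v, int n, int a, int b)
{
    long total = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        total += (long)v[i] * (a * b) + i * 4;
    }
    return total;
}

/* Hand-applied equivalent after the transformations. */
long sum_after(const int *v, int n, int a, int b)
{
    long total = 0;
    int ab = a * b;          /* loop-invariant code motion: computed once */
    int i4 = 0;              /* strength reduction: i*4 becomes repeated adds */
    const int *p = v;        /* induction variable elimination: the pointer  */
    const int *end;          /* replaces the index i altogether              */

    assert(n >= 0);
    end = v + n;
    for (; p < end; p++) {
        total += (long)*p * ab + i4;
        i4 += 4;
    }
    return total;
}
```

Both functions compute the same result for any non-negative n; the second simply does less work per iteration.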

Step Four— Register Allocation 
Register allocation is the process of placing variables 
in registers rather than main memory, allowing much 
faster access times. Proper allocation of registers can 
lead to significant improvement in execution speed. 
Most optimizing compilers attempt register allocation
only for local variables, to avoid problems caused by “aliasing,”
or referring to a variable in more than one way.
By using a sophisticated algorithm, the GNX-Version 3
C Optimizing Compiler considers nearly all variables
as candidates for register allocation.

The algorithm used by the optimizer is called the col- 
oring algorithm, derived from graph theory. The “live 
range” of each variable is constructed. The live range 
is the program path along which a variable has a val- 
ue; assignment to a variable generally starts a new 
live range, which terminates with the last use of that 
value. Two variables that do not have intersecting live 
ranges can share a register. More frequently used 
variables are given priority for register allocation. In 
this way, maximum usage can be made of the regis- 
ters. Other optimizations performed at this stage are: 

• Allocation Of Safe And Scratch Registers— By 
convention, registers R0 through R2 and F0 
through F3 are considered “scratch” registers; 
their values are not retained across procedure 
calls. Usage of these registers can reduce over- 
head of procedure calls. 

• Register Parameter Allocation— For static rou- 
tines, parameters are passed in registers whenever 
possible. 
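
The live-range idea described above (variables whose live ranges do not intersect can share a register, and more frequently used variables are served first) can be sketched as a greedy coloring in C. This is a simplified illustration, not the GNX algorithm: each live range is assumed to be a single interval of instruction positions, and all names (struct live_range, allocate_registers, MAX_VARS) are hypothetical.

```c
#include <assert.h>

#define MAX_VARS 16
#define NO_REG   (-1)

/* Simplified live range: one interval of instruction positions.
 * Real live ranges follow program paths and need not be contiguous. */
struct live_range {
    int start;   /* position of the defining assignment */
    int end;     /* position of the last use */
    int uses;    /* usage count, used as allocation priority */
    int reg;     /* assigned register, or NO_REG if left in memory */
};

/* Two ranges interfere when their intervals intersect. */
static int interferes(const struct live_range *a, const struct live_range *b)
{
    return a->start <= b->end && b->start <= a->end;
}

/* Greedy coloring: visit ranges from most to least used and give each the
 * lowest-numbered register not held by an interfering, already-colored
 * range. Returns the number of ranges that received a register. */
int allocate_registers(struct live_range *v, int nvars, int nregs)
{
    int done[MAX_VARS] = {0};
    int allocated = 0;
    int step, i, r;

    assert(nvars >= 0 && nvars <= MAX_VARS);
    for (i = 0; i < nvars; i++)
        v[i].reg = NO_REG;

    for (step = 0; step < nvars; step++) {
        int best = -1;

        /* Pick the unvisited range with the highest usage count. */
        for (i = 0; i < nvars; i++)
            if (!done[i] && (best < 0 || v[i].uses > v[best].uses))
                best = i;
        done[best] = 1;

        for (r = 0; r < nregs && v[best].reg == NO_REG; r++) {
            int free_reg = 1;

            for (i = 0; i < nvars; i++)
                if (i != best && v[i].reg == r && interferes(&v[i], &v[best]))
                    free_reg = 0;
            if (free_reg) {
                v[best].reg = r;
                allocated++;
            }
        }
    }
    return allocated;
}
```

With three ranges [0,5], [2,8], and [6,9], the first and third do not overlap and can share one register, so two registers are enough for all three variables.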

Step Five — Code Rewrite 

Code is rewritten in IR32 to be passed to the code 
generator. Code is reorganized where necessary to 
increase performance. 

2.5 The Code Generator 

The code generator’s input is an IR32 file; its output is 
assembly code that can be assembled by the GNX 
assembler into an object module. 

The code generator matches expression trees with 
optimal code sequences. Several “peephole” opti- 
mizations are performed by the code generator: fur- 
ther reduction of arithmetic identities, stack and frame 
alignments, and strength reductions. 

In addition, the target CPU and FPU are taken into
consideration when code is produced. Sequences of
code are chosen based on the characteristics of the
target processor specified by the user. This further
increases code efficiency.

3.0 Ordering Information 

Supported Host Environments and Order Codes:

SYS32/20:                 NSW-C-3-BHAF3
SYS32/30:                 NSW-C-3-BHBF3
VAX/VMS:                  NSW-C-3-BRVM
VAX/ULTRIX (UNIX bsd):    NSW-C-3-BRVX
MicroVAX/VMS:             NSW-C-3-BCVM
MicroVAX/ULTRIX:          NSW-C-3-BCVX
Sun-3:                    NSW-C-3-BCSX

GNX-Version 3 Assembler and Cross-Development
tools (required for use with the Optimizing C Compiler):

SYS32/20:                 NSW-ASM-3-BHAF3 (provided with SYS32/20 system)
SYS32/30:                 NSW-ASM-3-BHBF3 (provided with SYS32/30 system)
VAX/VMS:                  NSW-ASM-3-BRVM
VAX/ULTRIX (UNIX bsd):    NSW-ASM-3-BRVX
MicroVAX/VMS:             NSW-ASM-3-BCVM
MicroVAX/ULTRIX:          NSW-ASM-3-BCVX
Sun-3:                    NSW-ASM-3-BCSX

For further information regarding National Semicon- 
ductor’s software development tools and develop- 
ment hosts, please refer to the following datasheets: 
GNX-Version 3 Development Tools 
GNX-Version 3 FORTRAN 77 Compiler 
GNX-Version 3 Pascal Compiler 
SYS32/20 PC-Add-In Development Package
SYS32/30 PC-Add-In Development Package
Series 32000® GNX-Version 3
FORTRAN 77 Optimizing Compiler

[Figure TL/EE/10362-1: compiler block diagram; recoverable labels: FORTRAN 77, Optimizer, Code Generator, Assembly Code]

■ Generates high-quality code for the
Series 32000 architecture

■ Implements the FORTRAN 77 Language
as described by the American Standard
publication Programming Language
FORTRAN (ANSI X3.9-1978)

■ Uses state-of-the-art optimization
techniques

■ Supports mixed-language programming

■ Includes complete FORTRAN intrinsic
function and I/O libraries

■ Implements many extensions to
standard FORTRAN 77

■ Compiles under UNIX®, ULTRIX™, and
VMS™ operating systems


1.0 Introduction

A substantial amount of application code is developed
in a high-level language. Therefore, the speed and efficiency
of the application are functions not only of
processor speed, but also of the quality of the code generated
by the high-level language compiler. An inefficient
compiler can exact a significant performance penalty.
Likewise, a significant performance improvement
can be achieved for much lower cost in software rather
than hardware. For this reason, National Semiconductor
has developed a line of optimizing compilers
that generate extremely efficient code for the Series
32000 architecture.

1.1 Product Overview

The Series 32000 GNX-Version 3 FORTRAN 77 Optimizing
Compiler is a member of National Semiconductor’s
optimizing compiler family, which also includes
compilers that support the C and Pascal programming
languages. Because all three optimizing compilers use
a standard calling sequence, internal intermediate
representation, and object file format, mixed-language
programming is greatly simplified. The ability to use
mixed-language programming simplifies the porting of
pre-existing applications and code reuse. A detailed
discussion of mixed-language programming is presented
in the GNX-Version 3 FORTRAN 77 Optimizing
Compiler Reference Manual.

The FORTRAN 77 Optimizing Compiler fully implements
the FORTRAN 77 programming language, as
defined by the American Standard publication Programming
Language FORTRAN (ANSI X3.9-1978). In
addition, a command-line option is provided that
forces the compiler to accept as input only programs
that adhere to the FORTRAN 66 standard.

The input to the FORTRAN 77 Optimizing Compiler is 
a FORTRAN 77 language source program. The out- 
put, controlled by command-line options, is either a 
Series 32000 executable module, a Series 32000 ob- 
ject module, or Series 32000 assembly code. 

1.2 Native and Cross-support 

The GNX-Version 3 FORTRAN 77 Optimizing Compiler
is available hosted as a cross-support compiler on
the VAX™ series of computers, running the VMS,
UNIX (bsd), and ULTRIX operating systems. Also supported
are National Semiconductor’s SYS32™/20
and SYS32/30 development environments.

1.3 GNX Development Tools 

The GNX-Version 3 FORTRAN 77 Optimizing Compil- 
er is an integral component of the GNX Cross-devel- 
opment tool set. The GNX-Version 3 Assembler Pack- 
age includes the Series 32000 assembler, the GNX 
linker, debuggers, libraries, and development board 
monitors. The GNX-Version 3 Assembler Package is a 
prerequisite for the GNX-Version 3 FORTRAN 77 Op- 
timizing Compiler. See the GNX-Version 3 Develop- 
ment Tools Datasheet for more information on the 
GNX Tools. 

The SYS32/20 and SYS32/30 PC-Add-In Development
Packages are complete, high-performance
packages that convert an IBM®-PC™/AT or compatible
computer into a powerful multi-user system for
developing applications that use the Series 32000 family.
The SYS32 systems are based on the Series
32000 processor family; the SYS32/20 includes an
NS32032 Central Processing Unit, and the SYS32/30
is based on the NS32332 CPU. Both the SYS32/20
and SYS32/30 run a derivative of the UNIX System
V.3 operating system. Because these host systems
are themselves based on the Series 32000 processor
family, application code can be debugged on the host
system without down-loading to target hardware.

2.0 Compiler Structure 

The FORTRAN 77 Optimizing Compiler is a modular 
language processor consisting of five separate pro- 
grams: the driver, the macro preprocessor (cpp), the 
parser (front end), the optimizer, and the code genera- 
tor. 

2.1 The Driver 

The driver is a program that parses and interprets the 
command line and, in turn, sequentially calls each of 
the other programs, based on its input and the com- 
mand-line options invoked. Under the UNIX operating 
system, the assembler and linker are also automati- 
cally invoked by the driver as required; under VMS, 
the assembler is invoked by the driver, and linking is 
done at the command line. 

2.2 The Macro Preprocessor (cpp) 

The macro preprocessor is the standard C-language 
preprocessor, known as cpp. Preprocessing is an op- 
tional step and is performed only if macros are defined 
in the FORTRAN 77 source code. The macro preproc- 
essor’s input is the FORTRAN 77 program with pre- 
processor macros; its output is processed FORTRAN 
77 code, with all preprocessor commands expanded 
and transformed as necessary. The macro preproces- 
sor can be used to define constants, insert text from 
another file, or conditionally include or exclude source 
code from compilation based on a testable condition. 

2.3 FORTRAN 77 Language Parser (front end) 

The FORTRAN 77 language parser, known as f77_fe, takes
as input a FORTRAN 77 program. The output is an
intermediate representation that can be passed either to
the optimizer or the code generator. Several extensions
to standard FORTRAN are implemented in the FORTRAN 77
language parser.

Among the extensions implemented in the front end 
are: 

• Double Complex data type; each datum is represented
by a pair of double-precision real variables

• Short Integer data type; declarations of type
integer*2 are accepted

• Hollerith (nH) notation

• Variable-length program lines

• Unlimited identifier length and underscores in
identifier names

• Non-integer constants (binary, octal, and hexadecimal)

• Recursion; procedures may call themselves directly
or through a chain of other procedures

Note: A command-line option is provided that will force the compiler to 
accept only code that conforms to the FORTRAN 77 (or 
FORTRAN 66) standard (ANSI X3.9-1978). 

The output of the front end is a proprietary intermedi- 
ate representation that can be either used as input to 
the optional optimizer phase or passed directly to the 
code generator. This intermediate language, known 
as IR32, is an attributed tree-structured representa- 
tion. IR32 is completely high-level language indepen- 
dent; all of the GNX optimizing compilers produce the 
same internal representation. This allows a common 
back end to be shared by all GNX optimizing compil- 
ers. 

2.4 The Optimizer 

The state-of-the-art GNX optimizer is based on advanced
optimization theory developed over the past 15 years.
Depending on the compiler and application code
characteristics, the GNX optimizer improves code
performance from 15 to 200 percent beyond that of other
compilers.

The GNX-Version 3 FORTRAN 77 optimizer is the 
most innovative component of the GNX Optimizing 
Compilers. The optimizer’s input is an IR32 intermedi- 
ate representation file; its output is an optimized IR32 
file. The optimization pass is optional. 

Unlike many other optimizers that are local in nature,
optimizations are performed across the whole program by
using sophisticated global-data-flow analysis. The
optimization process can be thought of as a five-step
sequence. The sequence of optimizations has been
carefully chosen to ensure that each optimization is
performed to maximum effect and to provide more
opportunities for later optimizations. These steps are
as follows:

Step One — Local Optimizations 

The source program is read in one procedure at a time.
A procedure is then partitioned into basic blocks:
sequences of code that have branches only at entry or
exit. Optimizations performed at this stage include:

• Value Propagation— replacing variables with their 
most recent values 

• Constant Folding— evaluating expressions that 
consist solely of constants 

• Redundant Assignment Elimination— eliminating 
assignments that are not used or that are reas- 
signed prior to use 

The relationships between the various optimizations 
are illustrated as follows: 


The program sequence

    a = 4
    IF (a * 8 .LT. 0) THEN
        b = 15
    ELSE
        b = 20
    ENDIF
    . . . code which uses b but not a . . .

is translated by the compiler front end into the
following intermediate code

        a <- 4
        if (a * 8 >= 0) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: . . .

which is transformed by “value propagation” into

        a <- 4
        if (4 * 8 >= 0) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: . . .

which after “constant folding” becomes

        a <- 4
        if (true) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: . . .

“dead code removal” results in

        a <- 4
        goto L1
    L1: b <- 20
    L2: . . .

which is transformed by another “flow optimization” into

        a <- 4
        b <- 20

Since there is no further use of a, a <- 4 is a
“redundant assignment” and is eliminated, leaving:

        b <- 20
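The local optimizations named in Step One can be mimicked in a few
lines of Python. This is only a sketch under simplifying assumptions:
the (target, expression) tuples are an invented toy format that has
nothing to do with IR32, and the names simplify, variables, and
optimize_block are hypothetical. It handles a single basic block with
no branches.

```python
import operator

# Hypothetical toy IR: each statement is (target, expr), where expr is an
# int constant, a variable name, or a tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def simplify(expr, known):
    """Value propagation plus constant folding on one expression."""
    if isinstance(expr, str):              # variable: substitute a known constant
        return known.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        left, right = simplify(left, known), simplify(right, known)
        if isinstance(left, int) and isinstance(right, int):
            return OPS[op](left, right)    # fold: both operands are constants
        return (op, left, right)
    return expr                            # already a constant


def variables(expr):
    """Set of variable names read by an expression."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        return variables(expr[1]) | variables(expr[2])
    return set()


def optimize_block(block, live_out):
    """Optimize one basic block; live_out names the variables used after it."""
    known, folded = {}, []
    for target, expr in block:
        value = simplify(expr, known)
        if isinstance(value, int):
            known[target] = value          # remember the constant for later uses
        else:
            known.pop(target, None)        # target is no longer a known constant
        folded.append((target, value))

    # Redundant assignment elimination: walk backwards and keep only the
    # assignments whose target is still needed.
    needed, kept = set(live_out), []
    for target, value in reversed(folded):
        if target in needed:
            needed.discard(target)
            needed |= variables(value)
            kept.append((target, value))
    return kept[::-1]


block = [("a", 4), ("t", ("*", "a", 8)), ("b", ("+", "t", "c"))]
print(optimize_block(block, live_out={"b"}))   # [('b', ('+', 32, 'c'))]
```

The assignments to a and t disappear because their values were folded
into the expression for b and nothing after the block reads them.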


Step Two — Flow Optimizations 

A flow graph is constructed. Each basic block is a
node in the graph, with “arrows” drawn to represent
program flow. Optimizations performed at this stage
include:

• Branch elimination— branches to branches are 
removed. Code may be reordered to eliminate 
branches. 

• Dead code removal — code that will never be exe- 
cuted is removed. 
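The two flow optimizations above can be sketched on a small flow graph
held as a Python dictionary. The representation (a label mapped to its
code and successor labels) and the function names thread_branches and
remove_dead_blocks are assumptions made for illustration; they do not
describe the GNX optimizer's internal data structures.

```python
def thread_branches(graph):
    """Branch elimination: retarget edges that lead to a block which is
    empty and does nothing but branch to a single successor."""
    def final_target(label, seen=()):
        node = graph[label]
        if not node["code"] and len(node["succ"]) == 1 and label not in seen:
            return final_target(node["succ"][0], seen + (label,))
        return label

    return {
        label: {"code": node["code"],
                "succ": [final_target(s) for s in node["succ"]]}
        for label, node in graph.items()
    }


def remove_dead_blocks(graph, entry):
    """Dead code removal: drop blocks that cannot be reached from entry."""
    reachable, stack = set(), [entry]
    while stack:
        label = stack.pop()
        if label not in reachable:
            reachable.add(label)
            stack.extend(graph[label]["succ"])
    return {label: node for label, node in graph.items() if label in reachable}


# Hypothetical graph: "hop" only branches onward, "then" is never reached.
graph = {
    "entry": {"code": ["a <- 4"], "succ": ["hop"]},
    "hop":   {"code": [], "succ": ["L1"]},
    "then":  {"code": ["b <- 15"], "succ": ["L2"]},
    "L1":    {"code": ["b <- 20"], "succ": ["L2"]},
    "L2":    {"code": ["use b"], "succ": []},
}

optimized = remove_dead_blocks(thread_branches(graph), "entry")
print(list(optimized))   # ['entry', 'L1', 'L2']
```

Threading the branch first is what makes the "hop" block unreachable,
so the order of the two passes matters in this sketch.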

The following diagram is an example of a flow graph: 
Step Three — Global-Data-Flow Analysis 
Global-data-flow analysis is a process that identifies 
desirable global code transformations that can speed 
code execution. Since studies have shown that most 
programs spend 90 percent or more of their time in 
loops, particular attention is paid to transformations 
that allow loops to execute faster. This involves sever- 
al techniques: 

• Fully redundant expression elimination — Ex- 
pressions that are computed twice on the same 
path are instead computed only once, with the re- 
sult saved, usually in a register. 

• Partially redundant expression elimination — If an
expression is computed on one path to a point in the
program but not on another path to that point, the
computation is inserted on the path that lacks it. This
makes the later computation fully redundant, allowing it
to be eliminated.

• Loop invariant code motion— Values that are 
computed repeatedly inside of a loop are instead 
computed outside the loop and the result saved. 

• Strength reduction — Complex instructions are
replaced by simpler substitutes (e.g., multiplications
may be replaced with a sequence of additions).

• Induction variable elimination — Variables that 
maintain a fixed relation to other variables are re- 
placed. 
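The effect of three of these loop transformations can be shown with a
hand-written before-and-after pair. The functions below are a
hypothetical illustration in Python, not compiler output; the second
function is what the first might look like after loop invariant code
motion, strength reduction, and removal of the index from the loop
body.

```python
def address_sum_naive(n, base, stride, scale):
    total = 0
    for i in range(n):
        offset = scale * 4                   # same value on every iteration
        total += base + i * stride + offset  # one multiply by i per iteration
    return total


def address_sum_optimized(n, base, stride, scale):
    offset = scale * 4        # loop invariant code motion: computed once
    addr = base + offset      # value of base + i * stride + offset at i = 0
    total = 0
    for _ in range(n):
        total += addr
        addr += stride        # strength reduction: an add replaces the multiply
    return total              # the index i no longer appears in the loop body


print(address_sum_naive(5, 100, 8, 3), address_sum_optimized(5, 100, 8, 3))
```

Both calls return the same total; only the amount of work done inside
the loop differs.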

Step Four — Register Allocation 
Register allocation is the process of placing variables 
in registers rather than main memory, allowing much 
faster access times. Proper allocation of registers can 
lead to significant improvement in execution speed. 
Most optimizing compilers attempt register allocation
only for local variables, to avoid problems caused by
“aliasing,” or referring to a variable in more than one
way. By using a sophisticated algorithm, the GNX-Version 3
FORTRAN 77 Optimizing Compiler considers nearly all
variables as candidates for register allocation.

The algorithm used by the optimizer is called the col- 
oring algorithm, derived from graph theory. The “live 
range” of each variable is constructed. The live range 
is the program path along which a variable has a val- 
ue; assignment to a variable generally starts a new 
live range, which terminates with the last use of that 
value. Two variables that do not have intersecting live 
ranges can share a register. More frequently used 
variables are given priority for register allocation. In 
this way, maximum usage can be made of the regis- 
ters. Other optimizations performed at this stage are: 

• Allocation of safe and scratch registers— By 
convention, registers R0 through R2 and F0 
through F3 are considered “scratch” registers; 
their values are not retained across procedure 
calls. Usage of these registers can reduce over- 
head of procedure calls. 

• Register Parameter Allocation — for static rou- 
tines, parameters are passed in registers whenever 
possible. 
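The idea of sharing registers between variables whose live ranges do
not intersect can be sketched as follows. This is a simple greedy
colouring written for illustration, not the algorithm used by the GNX
optimizer. Live ranges are reduced to hypothetical (start, end)
position pairs, and the register names and use counts are made up.

```python
def allocate_registers(live_ranges, use_counts, registers):
    """Greedy colouring of live ranges.

    live_ranges: {var: (start, end)}, end exclusive (hypothetical positions)
    use_counts:  {var: number of uses}, used as the allocation priority
    registers:   list of available register names
    Returns {var: register or None}; None means the variable stays in memory.
    """
    def interferes(a, b):
        (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
        return s1 < e2 and s2 < e1          # the two ranges overlap

    assignment = {}
    # More frequently used variables are considered first.
    for var in sorted(live_ranges, key=lambda v: -use_counts.get(v, 0)):
        taken = {assignment[other] for other in assignment
                 if interferes(var, other)}
        free = [r for r in registers if r not in taken]
        assignment[var] = free[0] if free else None
    return assignment


live_ranges = {"i": (0, 10), "x": (0, 4), "y": (4, 10), "tmp": (2, 6)}
use_counts = {"i": 20, "x": 5, "y": 6, "tmp": 1}
print(allocate_registers(live_ranges, use_counts, ["R3", "R4"]))
```

In this run x and y end up in the same register because x is dead
before y becomes live, while the rarely used tmp overlaps everything
and is left in memory.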

Step Five — Code Rewrite 

Code is rewritten in IR32 to be passed to the code 
generator. Code is reorganized where necessary to 
increase performance. 

2.5 The Code Generator 

The code generator’s input is an IR32 file; its output is 
assembly code that can be assembled by the GNX 
assembler into an object module. 

The code generator matches expression trees with
optimal code sequences. Several “peephole”
optimizations are performed by the code generator:
further reduction of arithmetic identities, stack and
frame alignments, and strength reductions.


In addition, the target CPU and FPU are taken into 
consideration when code is produced. Sequences of 
code are chosen based on the characteristics of the 
target processor specified by the user. This further in- 
creases code efficiency. 


3.0 Ordering Information 

Supported Host Environments and Order Codes:

SYS32/20:                NSW-F77-3-BHAF3
SYS32/30:                NSW-F77-3-BHBF3
VAX/VMS:                 NSW-F77-3-BRVM
VAX/ULTRIX (UNIX bsd):   NSW-F77-3-BRVX
MicroVAX/VMS:            NSW-F77-3-BCVM
MicroVAX/ULTRIX:         NSW-F77-3-BCVX

GNX-Version 3 Assembler and Cross-development
tools (required for use with the Optimizing FORTRAN
77 Compiler):

SYS32/20:                NSW-ASM-3-BHAF3 (provided with SYS32/20 system)
SYS32/30:                NSW-ASM-3-BHBF3 (provided with SYS32/30 system)
VAX/VMS:                 NSW-ASM-3-BRVM
VAX/ULTRIX (UNIX bsd):   NSW-ASM-3-BRVX
MicroVAX/VMS:            NSW-ASM-3-BCVM
MicroVAX/ULTRIX:         NSW-ASM-3-BCVX



For further information regarding National Semicon- 
ductor’s software development tools and develop- 
ment hosts, please refer to the following datasheets: 
GNX-Version 3 Development Tools 
GNX-Version 3 C Compiler 
GNX-Version 3 Pascal Compiler 
SYS32/20 PC-Add-In Development Package 
SYS32/30 PC-Add-In Development Package 
■ Generates high-quality code for the 
Series 32000 architecture 

■ Implements the Pascal Language as 
described by the International Standards 
Organization (ISO) standard ISO dp7185 
level 1 

■ Uses state-of-the-art optimization 
techniques 


■ Supports mixed-language programming 

■ Includes a complete Pascal run-time 
library and highly optimized math library 

■ Implements many extensions to 
standard Pascal 

■ Compiles under UNIX®, ULTRIX™ and 
VMS™ operating systems 


1.0 Introduction 

A substantial amount of application code is developed
in a high-level language. Therefore, the speed and
efficiency of the application are functions not only of
processor speed, but also of the quality of the code
generated by the high-level language compiler. An
inefficient compiler can exact a significant performance
penalty. Conversely, a significant performance
improvement can often be achieved at much lower cost in
software than in hardware. For this reason, National
Semiconductor has developed a line of optimizing
compilers that generate extremely efficient code for the
Series 32000 architecture.

1.1 Product Overview 

The Series 32000 GNX-Version 3 Pascal Optimizing 
Compiler is a member of National Semiconductor’s 
optimizing compiler family, which also includes compil- 
ers that support the C and FORTRAN 77 program- 
ming languages. Because all three optimizing compil- 
ers use a standard calling sequence, internal interme- 
diate representation, and object file format, mixed-lan- 
guage programming is greatly simplified. The ability to 
use mixed-language programming simplifies the port- 
ing of pre-existing applications and code reuse. A de- 
tailed discussion of mixed-language programming is 
presented in the GNX-Version 3 Pascal Optimizing 
Compiler Reference Manual. 


The Pascal Optimizing Compiler fully implements the
Pascal programming language, as defined by the
International Standards Organization (ISO) standard
ISO dp7185 level 1, along with several useful extensions
that build on those found in the University of
California, Berkeley Pascal compiler (pc). In addition, a
command-line option is provided that forces the compiler
to accept as input only programs that adhere to the ISO
standard.

The input to the Pascal Optimizing Compiler is a Pas- 
cal language source program. The output, controlled 
by command-line options, is either a Series 32000 ex- 
ecutable module, a Series 32000 object module, or 
Series 32000 assembly code. 

1.2 Native and Cross-Support 

The GNX-Version 3 Pascal Optimizing Compiler is
available as a cross-support compiler hosted on the
VAX™ series of computers running the VMS, UNIX
(bsd), and ULTRIX operating systems. Also supported
are National Semiconductor’s SYS32™/20 and
SYS32/30 development environments.

1.3 GNX Development Tools 

The GNX-Version 3 Pascal Optimizing Compiler is an 
integral component of the GNX Cross-development 
tool set. The GNX-Version 3 Assembler Package in- 
cludes the Series 32000 assembler, the GNX linker, 
debuggers, libraries, and development board moni- 
tors. The GNX-Version 3 Assembler Package is a pre- 
requisite for the GNX-Version 3 Pascal Optimizing 
Compiler. See the GNX-Version 3 Development Tools 
Datasheet for more information on the GNX Tools. 

The SYS32/20 and SYS32/30 PC-Add-In Development
Packages are complete, high-performance packages that
convert an IBM PC/AT or compatible computer into a
powerful multi-user system for developing applications
that use the Series 32000 family. The SYS32 systems are
based on the Series 32000 processor family; the SYS32/20
includes an NS32032 Central Processing Unit, and the
SYS32/30 is based on the NS32332 CPU. Both the SYS32/20
and SYS32/30 run a derivative of the UNIX System V.3
operating system. Because these host systems are
themselves based on the Series 32000 processor family,
application code can be debugged on the host system
without downloading to target hardware.

2.0 Compiler Structure 

The Pascal Optimizing Compiler is a modular lan- 
guage processor consisting of five separate programs: 
the driver, the macro preprocessor (cpp), the parser 
(front end), the optimizer, and the code generator. 

2.1 The Driver 

The driver is a program that parses and interprets the 
command line and, in turn, sequentially calls each of 
the other programs, based on its input and the com- 
mand-line options invoked. Under the UNIX operating 
system, the assembler and linker are also automati- 
cally invoked by the driver as required; under VMS, 
the assembler is invoked by the driver, and linking is 
done at the command line. 

2.2 The Macro Preprocessor (cpp) 

The macro preprocessor is the standard C-language 
preprocessor, known as cpp. Preprocessing is an op- 
tional step and is performed only if macros are defined 
in the Pascal source code. The macro preprocessor’s 
input is the Pascal program with preprocessor macros; 
its output is processed Pascal code, with all preproc- 
essor commands expanded and transformed as nec- 
essary. The macro preprocessor can be used to de- 
fine constants, insert text from another file, or condi- 
tionally include or exclude source code from compila- 
tion based on a testable condition. 

2.3 The Pascal Language Parser (front end) 

The Pascal language parser, known as pas_fe, takes 
as input a Pascal program. The output is an intermedi- 
ate representation that can be passed either to the 
optimizer or the code generator. Conformant array pa- 
rameters, as defined in the ISO level 1 Standard, are 
fully supported. Several extensions to standard Pascal 
are implemented in the Pascal language parser. 


Among the extensions implemented in the front end 
are: 

• Separate compilation; programs can be divided into 
a number of files that can be compiled separately 

• Longreal data type; double-precision (64-bit) float- 
ing point values 

• String padding of constant strings with blanks 

• Conversions of pointers to integers and vice versa 

• Unlimited identifier length and underscores in iden- 
tifier names 

• Non-integer constants (binary, octal, and hexadeci- 
mal) 

• Constant expressions; constants can be defined in 
terms of mathematical expressions 

• Predefined argc and argv functions; these allow
application programs to easily accept and process
command-line arguments

Note: A command-line option is provided that will force the compiler to 
accept only code that conforms to the ISO Pascal standard ISO 
dp7185 level 1. 

The output of the front end is a proprietary intermedi- 
ate representation that can be either used as input to 
the optional optimizer phase or passed directly to the 
code generator. This intermediate language, known 
as IR32, is an attributed tree-structured representa- 
tion. IR32 is completely high-level language indepen- 
dent; all of the GNX optimizing compilers produce the 
same internal representation. This allows a common 
back end to be shared by all GNX optimizing compil- 
ers. 

2.4 The Optimizer 

The state-of-the-art GNX optimizer is based on advanced
optimization theory developed over the past 15 years.
Depending on the compiler and application code
characteristics, the GNX optimizer improves code
performance from 15 to 200 percent beyond that of other
compilers.

The GNX-Version 3 Pascal optimizer is the most inno- 
vative component of the GNX Optimizing Compilers. 
The optimizer’s input is an IR32 intermediate repre- 
sentation file; its output is an optimized IR32 file. The 
optimization pass is optional. 

Unlike many other optimizers that are local in nature,
optimizations are performed across the whole program by
using sophisticated global-data-flow analysis. The
optimization process can be thought of as a five-step
sequence. The sequence of optimizations has been
carefully chosen to ensure that each optimization is
performed to maximum effect and to provide more
opportunities for later optimizations. These steps are
as follows:

Step One — Local Optimizations 
The source program is read in one procedure at a time.
A procedure is then partitioned into basic blocks:
sequences of code that have branches only at entry or
exit. Optimizations performed at this stage include:

• Value Propagation — replacing variables with their 
most recent values 

• Constant Folding — evaluating expressions that 
consist solely of constants 

• Redundant Assignment Elimination — eliminating 
assignments that are not used or that are reas- 
signed prior to use 

Step Two — Flow Optimizations 
A flow graph is constructed. Each basic block is a 
node in the graph, with “arrows” drawn to represent 
program flow. Optimizations performed at this stage 
include: 

• Branch elimination— branches to branches are 
removed. Code may be reordered to eliminate 
branches. 

• Dead code removal — code that will never be exe- 
cuted is removed. 

The following diagram is an example of a flow graph: 
Step Three — Global-Data-Flow Analysis 
Global-data-flow analysis is a process that identifies 
desirable global code transformations that can speed 
code execution. Since studies have shown that most 
programs spend 90 percent or more of their time in 
loops, particular attention is paid to transformations 
that allow loops to execute faster. This involves sever- 
al techniques: 

• Fully redundant expression elimination— Ex- 
pressions that are computed twice on the same 
path are instead computed only once, with the re- 
sult saved, usually in a register. 

• Partially redundant expression elimination — If an
expression is computed on one path to a point in the
program but not on another path to that point, the
computation is inserted on the path that lacks it. This
makes the later computation fully redundant, allowing it
to be eliminated.
• Loop invariant code motion — Values that are 
computed repeatedly inside of a loop are instead 
computed outside the loop and the result saved. 

• Strength reduction — Complex instructions are
replaced by simpler substitutes (e.g., multiplications
may be replaced with a sequence of additions).

• Induction variable elimination— Variables that 
maintain a fixed relation to other variables are re- 
placed. 
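Fully redundant expression elimination can be illustrated on a single
basic block. The sketch below is a deliberately reduced version of the
idea: the real analysis is global, whereas this one looks only at
straight-line code, and the tuple format and the name
eliminate_redundant are hypothetical.

```python
def eliminate_redundant(block):
    """Reuse the result of an expression already computed in the block.

    Each statement is (target, expr); expr is a constant, a variable
    name, or a tuple (op, left, right).
    """
    available = {}   # (op, left, right) -> variable currently holding it
    result = []
    for target, expr in block:
        if isinstance(expr, tuple) and expr in available:
            expr = available[expr]          # reuse the saved result
        result.append((target, expr))
        # Assigning to target invalidates every saved expression that
        # reads target or whose result was held in target.
        available = {e: v for e, v in available.items()
                     if target not in e[1:] and v != target}
        if isinstance(expr, tuple) and target not in expr[1:]:
            available[expr] = target
    return result


block = [
    ("x", ("*", "a", "b")),
    ("y", ("+", "x", 1)),
    ("z", ("*", "a", "b")),   # same expression, operands unchanged
]
print(eliminate_redundant(block))
```

The third statement becomes a plain copy of x. Had a or b been
reassigned in between, the saved expression would have been discarded
and the multiplication kept.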


The relationships between the various optimizations 
are illustrated as follows: 


The program sequence

    a := 4;
    if (a * 8 < 0) then b := 15
    else b := 20;
    . . . code which uses b but not a . . .

is translated by the compiler front end into the
following intermediate code

        a <- 4
        if (a * 8 >= 0) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: . . .

which is transformed by “value propagation” into

        a <- 4
        if (4 * 8 >= 0) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: . . .

which after “constant folding” becomes

        a <- 4
        if (true) goto L1
        b <- 15
        goto L2
    L1: b <- 20
    L2: . . .

“dead code removal” results in

        a <- 4
        goto L1
    L1: b <- 20
    L2: . . .

which is transformed by another “flow optimization” into

        a <- 4
        b <- 20

Since there is no further use of a, a <- 4 is a
“redundant assignment” and is eliminated, leaving:

        b <- 20


Step Four — Register Allocation 
Register allocation is the process of placing variables
in registers rather than main memory, allowing much
faster access times. Proper allocation of registers can
lead to significant improvement in execution speed.
Most optimizing compilers attempt register allocation
only for local variables, to avoid problems caused by
“aliasing,” or referring to a variable in more than one
way. By using a sophisticated algorithm, the GNX-Version 3
Pascal Optimizing Compiler considers nearly all
variables as candidates for register allocation.

The algorithm used by the optimizer is called the
coloring algorithm, derived from graph theory. The “live
range” of each variable is constructed. The live range 
is the program path along which a variable has a val- 
ue; assignment to a variable generally starts a new 
live range, which terminates with the last use of that 
value. Two variables that do not have intersecting live 
ranges can share a register. More frequently used 
variables are given priority for register allocation. In 
this way, maximum usage can be made of the regis- 
ters. Other optimizations performed at this stage are: 

• Allocation of safe and scratch registers — By
convention, registers R0 through R2 and F0
through F3 are considered “scratch” registers;
their values are not retained across procedure
calls. Usage of these registers can reduce the
overhead of procedure calls.

• Register Parameter Allocation — For static rou- 
tines, parameters are passed in registers whenever 
possible. 

Step Five — Code Rewrite 

Code is rewritten in IR32 to be passed to the code 
generator. Code is reorganized where necessary to 
increase performance. 

2.5 The Code Generator 

The code generator’s input is an IR32 file; its output is 
assembly code that can be assembled by the GNX 
assembler into an object module. 

The code generator matches expression trees with 
optimal code sequences. Several “peephole” opti- 
mizations are performed by the code generator: fur- 
ther reduction of arithmetic identities, stack and frame 
alignments, and strength reductions. 

In addition, the target CPU and FPU are taken into 
consideration when code is produced. Sequences of 
code are chosen based on the characteristics of the 
target processor specified by the user. This further in- 
creases code efficiency. 


3.0 Ordering Information 

Supported Host Environments and Order Codes:

SYS32/20:                NSW-PAS-3-BHAF3
SYS32/30:                NSW-PAS-3-BHBF3
VAX/VMS:                 NSW-PAS-3-BRVM
VAX/ULTRIX (UNIX bsd):   NSW-PAS-3-BRVX
MicroVAX/VMS:            NSW-PAS-3-BCVM
MicroVAX/ULTRIX:         NSW-PAS-3-BCVX

GNX-Version 3 Assembler and Cross-development
tools (required for use with the Optimizing Pascal
Compiler):

SYS32/20:                NSW-ASM-3-BHAF3 (provided with SYS32/20 system)
SYS32/30:                NSW-ASM-3-BHBF3 (provided with SYS32/30 system)
VAX/VMS:                 NSW-ASM-3-BRVM
VAX/ULTRIX (UNIX bsd):   NSW-ASM-3-BRVX
MicroVAX/VMS:            NSW-ASM-3-BCVM
MicroVAX/ULTRIX:         NSW-ASM-3-BCVX

For further information regarding National Semicon- 
ductor’s software development tools and develop- 
ment hosts, please refer to the following datasheets: 
GNX-Version 3 Development Tools 
GNX-Version 3 C Compiler 
GNX-Version 3 FORTRAN 77 Compiler 
SYS32/20 PC-Add-In Development Package 
SYS32/30 PC-Add-In Development Package 
Glossary 

In our efforts to be concise and precise, we often invent new words or acronyms to use as shorthand representations of “things” 
that require much longer names if the jargon is not used. Being humans, we then become very impressed with our ability to 
exclude those not in “the know” and another "in" group is formed. This glossary has been developed to help bridge this 
language gap. We know it will help. We hope you will use it. 

Abort — The first step of recovery when an instruction or its operand(s) is not available in main memory. An Abort is initiated by 
the Memory Management Unit (MMU) and handled by the CPU. 

Absolute Address — An address that is permanently assigned to a fixed location in main memory. In assembly code, a pattern 
of characters that identifies a fixed storage location. 

Access Time — The time interval between when a request for information is made and the instant this information is available.

Access Class — The five Series 32000 access classes are memory read, memory write, memory read-modify-write, memory
address, and register address. The access class informs the Series 32000 CPU how to interpret a reference to a general
operand. Each instruction assigns an access class to each of its two operands, which in turn fully defines the action of any
addressing mode in referencing that operand.

Accumulator— A register which stores the result of an ALU operation. 

Ada — A high level language designed for the Department of Defense. It gives preference to full English words. It is meant to be 
the standard military language. 

Address — An expression, usually numerical, which designates a specific location in a storage or memory device.

Address-Data Register — A register which may contain either address or data, sometimes referred to as a general-purpose
register.

Address Strobe — Control signal used to tell external devices when the address is valid on the external address bus.

Address Translation — The process by which a logical address emanating from the CPU is transformed into a physical address
to main memory. This is performed by the Memory Management Unit (MMU) in Series 32000 systems. Logical address to
physical address mapping is established by the operating system when it brings pages into main memory.

Addressing Mode — The manner in which an operand is accessed. Series 32000 CPUs have nine addressing modes: Register,
Register Relative, Memory Relative, Immediate, Absolute, External, Top-of-Stack, Memory Space, and Scaled Indexing.

Algorithm — A set of procedures by which a given result is obtained.

Alignment — The issue of whether an instruction must begin on a byte, double byte, or quad byte address boundary. 

ALU — Arithmetic Logic Unit. A computational subsystem which performs the arithmetic and logical operations of a digital 
system. 

Array — A structured data type consisting of a number of elements, all of the same data type, such that each data element can
be individually identified by an integer index. Arrays represent a basic storage data type used in all high-level languages.

ASCII — (American National Standard Code for Information Interchange, 1968). This standard code uses a character set
generally coded as 7-bit characters (8 bits when using parity check). Originally defined to allow human readable information to be
passed to a terminal, it is used for information interchange among data processing systems, communication systems, and
associated equipment. The ASCII set consists of alphabetic, numeric, and control characters. Synonymous with USASCII.

Assemble — To prepare a machine language program (also called machine code or object code) from a symbolic language
program by substituting absolute operation codes for symbolic operation codes and absolute or relocatable addresses for
symbolic addresses. Machine code is a series of ones and zeros which a computer “understands”.

Assembler — This program changes the programmer’s source program (written in English-like assembly language and
understandable to the programmer) to the 1’s and 0’s that the machine “understands”. In particular, the Assembler converts assembly
language to machine code. This machine code output is called the OBJECT file.

Assembly Language — A step up in the language chain. This is a set of instructions which is made up of alphanumeric
characters which, with study, are understandable to the programmer. Different types of machines have different assembly
languages, so the assembly language programmer must learn a different set of instructions each time s/he changes machine.

Associative Cache — A dual storage area where each data entry has an associated “tag” entry. The tags are simultaneously
compared to the input value (a logical address) in the case of the MMU, and if a matching tag is found, the associated data entry
is output. An associative cache is present within the MMU in Series 32000 systems to provide logical-to-physical address
translation.

Asynchronous Device — A device in which the speed of operation is not related to any frequency in the system to which it is 
connected. 

BASIC — This acronym stands for Beginner's All-purpose Symbolic Instruction Code. BASIC is one of the most “English like” of 
the high level languages and is usually the first programming language learned. 

Baud Rate — Data transfer rate. For most serial transmission protocols, this is synonymous with bits-per-second (bps). 

BCD — Binary Coded Decimal. A binary numbering system for coding decimal numbers. A 4-bit grouping provides a binary value
range from 0000 to 1001, and codes the decimal digits “0” through “9”. To count to 9 requires a single 4-bit grouping; to count
to 99 takes two groupings of 4 bits; to count to 999 takes three groupings of 4 bits, etc.
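As a small illustration of the grouping described above, the following
Python sketch packs a decimal number into BCD, one 4-bit group per
digit, and unpacks it again. The function names to_bcd and from_bcd are
made up for this example.

```python
def to_bcd(n):
    """Pack a non-negative integer into BCD, one 4-bit group per decimal digit."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    bcd, shift = 0, 0
    while True:
        n, digit = divmod(n, 10)
        bcd |= digit << shift     # place the digit in the next 4-bit group
        shift += 4
        if n == 0:
            return bcd


def from_bcd(bcd):
    """Unpack a BCD value back into an ordinary integer."""
    n, place = 0, 1
    while bcd:
        n += (bcd & 0xF) * place  # lowest 4-bit group is the next decimal digit
        bcd >>= 4
        place *= 10
    return n


print(format(to_bcd(99), "08b"))   # 10011001: two groups, each coding a 9
```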

Benchmark — In terms of computers, this refers to a software program designed to perform some task which will demonstrate 
the relative processing speed of one computer versus another. 
Bit — An abbreviation of “binary digit”. It is a unit of information represented by either a one or a zero. 

Bit Field— A group of bits addressable as a single entity. A bit field is fully specified by the location of its least significant bit and 
its length in bits. In Series 32000 systems, bit fields may be from one to 32 bits in length. 
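The offset-and-length description of a bit field can be made concrete
with ordinary shifts and masks. The helpers below are a hypothetical
Python sketch of the idea; they are not a model of any Series 32000
instruction.

```python
def extract_field(word, offset, length):
    """Return the field of `length` bits whose least significant bit is at `offset`."""
    if not 1 <= length <= 32:
        raise ValueError("field length must be from 1 to 32 bits")
    return (word >> offset) & ((1 << length) - 1)


def insert_field(word, offset, length, value):
    """Return `word` with the field at `offset` replaced by `value`."""
    if not 1 <= length <= 32:
        raise ValueError("field length must be from 1 to 32 bits")
    mask = ((1 << length) - 1) << offset
    return (word & ~mask) | ((value << offset) & mask)


print(extract_field(0b10110110, 2, 3))   # bits 2..4 are 101, so this prints 5
```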

Branch— A nonsequential flow in a software instruction stream. 

Breakpoint— A place in a routine specified by an instruction, instruction digit, or other condition, where the software program 
flow will be interrupted by external intervention or by a monitor routine. 

Buffer — An isolating circuit used to avoid reaction of a driven circuit on the corresponding driver circuit. Buffers also supply 
increased current drive capacity. 

Bus — A group of conductors used for transmitting signals or power. 

Bus Cycle — The time necessary to complete one transfer of information requiring the use of external address, data and control 
buses. 

Byte — Eight bits. 

Byte Enable — BE0 to BE3. CPU control signals which activate memory banks, each bank providing one byte of data per
address.

C — A highly structured high level language developed by Bell Laboratories to optimize the size and efficiency of the program. 
This language has gained much popularity because it allows the programmer to get close to the hardware (low level) as well as 
being a high level language. Before C, the programmer who had to address the hardware had to use assembly language or 
machine code. 

Cache — See Associative Cache. 

Cache Hit — In the MMU, logical-to-physical address translation takes place via the associative cache. A cache hit occurs when the 
addressed page is resident in physical memory and its logical address tag is present in the MMU’s translation cache. 

Cache Miss — Occurs when a logical address is presented to the MMU and no physical address translation entry is found in the MMU’s 
associative cache. 
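
The difference between a hit and a miss can be sketched with a dictionary standing in for the translation cache. This is only a software model of the lookup, not a description of the MMU hardware; the 512-byte page size and the names `PAGE_SIZE` and `lookup` are assumptions made for the illustration.

```python
PAGE_SIZE = 512  # assumed page size in bytes


def lookup(logical_address, translation_cache):
    """Model a translation-cache lookup.

    `translation_cache` maps a logical page number (the tag) to a
    physical page number. Returns ("hit", physical_address) when the
    tag is present, otherwise ("miss", None).
    """
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page in translation_cache:
        return "hit", translation_cache[page] * PAGE_SIZE + offset
    return "miss", None
```

On a miss, a real system would consult the translation tables in memory before retrying; that step is left out here.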

Cascaded — Stringing together of units to expand the operation of the unit. Interrupt Control Units present in a Series 32000 
system which are in addition to the Master ICU are referred to as “cascaded” ICUs; i.e., interrupts cascade from a second-level 
ICU through the master ICU to the CPU. 

Clock— A device that generates a periodic signal used for synchronization. 

Clock Cycle — After making a low-to-high transition, the clock will have completed one cycle when it is about to make another 
low-to-high transition. This time is equal to 1/f, where f = the clock frequency. 

COBOL — This acronym stands for “Common Business Oriented Language”. It is a language especially good for bookkeeping 
and accounting. 

COFF — Common Object File Format. A standard way of constructing files, developed by AT&T for the express purpose of 
making all files similar. This helps reduce the situation where large files developed by one organization won’t run on another 
organization's equipment simply because the software interfaces are different. It provides a great potential for savings in both 
time and money. 

Compile — To take a program written in a High-Level Language such as C, Pascal, or FORTRAN and convert it into an object- 
code format which can be loaded into a computer’s main memory. During compilation, symbolic HLL statements, called source 
code, are converted into one or more machine instructions which the CPU “understands”. A compiler also calls the assemble 
function. 

Compiler— The program that converts from Source to Machine Code. The conversion is from a particular high level language to 
machine code. For example, the C compiler will convert a C source program written by a programmer to machine code. This 
machine code output is in the same format as that of the assembler and is also called an OBJECT file. 

CPU — Central Processing Unit. The portion of a computer system that contains the arithmetic logic unit, register file, and other 
control oriented subsystems. It performs arithmetic operations, controls instruction processing, and provides timing signals and 
other housekeeping operations. 

Cross Support — The alternative to using a “Native” development system like SYS32 to develop your programs is to use Cross Support 
software. “Native” means that the CPU in the development system is the same as the CPU in the system being developed. 
Cross support software is all of the necessary programs for development that operate on one CPU, but generate code for 
another CPU. Use of the VAX to generate Series 32000 code is a good example of cross support. 

Demand-Paged Virtual Memory— A virtual memory method in which memory is divided into blocks of equal size which are 
referred to as pages. These pages are then moved back and forth between main memory and secondary storage as required by 
the CPU. Demand paging reduces the problem of memory fragmentation which results in unused memory space. 

Dispatch Table — In Series 32000 systems, this is an area of memory which contains interrupt descriptors for all possible 
hardware interrupts and software traps. The interrupt descriptor directs the CPU to the module descriptor for the procedure 
which is designed to handle that particular interrupt. 

Displacement— A numerical offset from a known point of reference. Displacements are used in programming to facilitate 
position independent code, such that a given program can be loaded anywhere in memory. In Series 32000 processors, a 
displacement is contained in the instruction itself. 

DMA — Direct Memory Access. A method that uses a small processor (DMA Controller) whose sole task is that of controlling 
input-output or data movement. With DMA, data is moved into or out of the system without CPU intervention once the DMA 
controller has been initialized by the CPU and activated. 

Double-Precision — With reference to 32000 floating-point arithmetic, a double-precision number has a 52-bit fraction field, an 
11-bit exponent field and a sign bit (64 bits total). 
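
The three fields can be separated with shifts and masks. This sketch assumes the host's Python `float` uses the same IEEE binary layout (sign in the top bit, then the 11-bit exponent, then the 52-bit fraction); the function name `split_double` is hypothetical.

```python
import struct


def split_double(x):
    """Return (sign, exponent, fraction) of a 64-bit floating-point value."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, stored with a bias
    fraction = bits & ((1 << 52) - 1)   # 52 bits
    return sign, exponent, fraction
```

Under this layout the value 1.0 has a zero sign bit, a stored exponent of 1023 and a zero fraction field.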

Double Word — Two words, i.e., 32 bits. 

Editor — A program which allows a person to write and modify text. This program can be as complicated as the situation 
requires, from the very simple line editor to the most complicated word processor. Letters, numbers and unprintable control 
characters are stored in memory so that they can be recalled for modification or printing. The programmer uses this device to 
enter the program into the computer. At this stage, the program is recognizable to both the programmer and the computer as 
lines of English text. This English version of the program is known as the SOURCE. 

Emulate — To imitate one system with another, such that the imitating system accepts the same data, executes the same 
programs, and achieves the same results as the imitated system. 

Exception — An occurrence which must be resolved through CPU intervention. An exception results in the suspension of normal 
program flow. In Series 32000 systems, exceptions occur as a result of a hardware reset, interrupt or software traps. Execution 
of floating-point instructions may also result in occurrences which must be resolved through CPU intervention. 

Exponent — In scientific notation, a numeral that indicates the power to which the base is raised. 

EXEC2 — NSC’s Real Time Executive for Series 32000. 

FIFO — First-in first-out. A FIFO device is one from which data can be read out only in the same order as it was entered, but not 
necessarily at the same rate. 
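
A software model of this behavior is sketched below, using Python's `collections.deque` as the storage. Writes and reads are interleaved to show that the rates may differ while the order is preserved; the helper name `drain` is hypothetical.

```python
from collections import deque


def drain(fifo):
    """Read out everything currently held, oldest entry first."""
    out = []
    while fifo:
        out.append(fifo.popleft())
    return out


fifo = deque()
fifo.append("first")     # two writes in a row...
fifo.append("second")
oldest = fifo.popleft()  # ...then a single read returns the oldest entry
fifo.append("third")
remaining = drain(fifo)  # the rest comes out in the order it was entered
```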

Floating-Point— A method by which computers deal with numbers having a fractional component. In general, it pertains to a 
system in which the location of the decimal/binary point does not remain fixed with respect to one end of numerical expressions, 
but is regularly recalculated. The location of the point is usually given by expressing a power of the base. 

FORTRAN— A high level language written for the scientific community. It makes heavy use of algebraic expressions and 
arithmetic statements. 

FP— Frame Pointer. CPU register which points to a dynamically allocated data area created at the beginning of a procedure by 
the ENTER instruction. 

FPU — Floating-Point Unit is a slave processor in Series 32000 systems which implements in hardware all calculations needed to 
support floating-point arithmetic, which otherwise would have to be implemented in software. The NS32081 FPU provides high- 
speed floating point instructions for single (32-bit) and double (64-bit) precision. Supports IEEE standard for binary floating point 
arithmetic. Compatible with NS32032, NS32C032, NS32016, NS32C016 and NS32008 CPUs. 

Fragmented— The term used to describe the presence of small, unused blocks of memory. The problem is especially common 
in segmented memory systems, and results in inefficient use of memory storage. 

Frame — A block of memory on the stack that provides local storage for parameters in the current procedure. 

GENIX — The NSC version of the UNIX operating system, ported to work with the Series 32000. It also has all of the necessary 
utilities added so that program development can be accomplished. 

Hardware — Physical equipment, e.g., mechanical, magnetic, electrical, or electronic devices, as opposed to the software 
programs or method in which the hardware is used. 

High Level Languages — These are languages which are not dependent on the type of computer on which they run. A program 
written in a high level language will generally run on any computer for which there is a compiler for that language. This feature 
makes high level languages “Portable”, i.e., the same program will run on many different types of computers. A HLL requires a 
compiler or interpreter that translates each HLL statement into a series of machine language instructions for a particular 
machine. 

ICU — Interrupt Control Unit. A memory-mapped microprocessor support chip in Series 32000 systems which handles external 
interrupts as well as additional software traps. The ICU provides a vector to the CPU to identify the servicing software procedure. 

Indexing — In computers, a method of address modification that is performed by means of index registers. 

Index Register— A register whose contents may be added to or subtracted from the operand address. 

Indirect Addressing— Programming method where the initial address is the storage location of a word which is the actual 
address. This indirect address is the location of the data to be operated upon. 
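
The extra memory reference involved can be shown with a dictionary standing in for memory. This is a sketch only; the addresses and the function names `load_direct` and `load_indirect` are invented for the illustration.

```python
def load_direct(memory, address):
    """Direct addressing: the address given is the location of the data."""
    return memory[address]


def load_indirect(memory, address):
    """Indirect addressing: the address given holds the actual address."""
    actual_address = memory[address]   # first reference fetches the pointer
    return memory[actual_address]      # second reference fetches the data


# Location 0x100 holds the address 0x200; location 0x200 holds the data.
memory = {0x100: 0x200, 0x200: 42}
```

Here `load_direct(memory, 0x100)` returns the stored address, while `load_indirect(memory, 0x100)` follows it to the data.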

Instruction— A statement that specifies an operation and the values or locations of its operands, i.e., it tells the CPU what to do 
and to what. 

Instruction Cycle — The period of time during which a programmed system executes a particular instruction. 

Instruction Fetch — The action of accessing the next instruction from memory, often overlapped by its partial execution. 

Instruction Queue — With Series 32000 CPUs, this is a small area of RAM organized as a FIFO buffer which stores prefetched 
instructions until the CPU is ready to execute them. 

Interpreter— A program which translates HLL statements into machine instructions at run time, i.e., while the program is 
executing, and is co-resident with the user program. 

Interrupt — To signal the CPU to stop a software program in such a way that it can be resumed and branch to another section of 
code. Interrupts can be caused by events external or internal to the CPU, and by either software or hardware. 

INTBASE — Interrupt Base Register. In the Series 32000, a 32-bit CPU register which holds the address of the dispatch table 
containing addresses for interrupts and traps. 

ISE — In-System Emulator. A computer system which imitates the operation of another in terms of software execution. In 
microprocessor system development, the ISE takes the place of the microprocessor by means of a connector at the end of an 
umbilical cable. Not only does the ISE perform all the functions of the microprocessor, but it also allows the engineer to debug 
his system by setting breakpoints on various conditions, permits tracing of program flow, and provides substitution memory 
which may be used in place of actual target system memory. 

ISV — Independent Software Vendor. A vendor, independent from National Semiconductor, who ports or develops software for 
Series 32000 components. They in turn sell this software to our customers who are designing Series 32000 based products. 

Kernel — This is the name given to the core of the operating system. Other programs are added to the kernel to provide the 
features of the operating system. The kernel provides control and synchronization. 

Language — A set of characters and symbols and the rules for using them. In our context, it is the “English like” format of the 
instructions which are understood by both the programmer and the computer. 

Library — High level languages as well as assembly language contain many routines which are used over and over again. To 
prevent the programmer from having to write the routine every time it is needed, these routines are stored in libraries to be 
referenced each time they are needed. These libraries are also OBJECT files. 

Linear Address Space — An address space where addresses start at location zero and proceed in a linear fashion (i.e., with no 
holes or breaks) to the upper limit imposed by the total number of bits in a logical address. 

Link Base — In the Series 32000, Module Descriptor entry which points to a table in memory containing entries which reference 
variables or entry points in Modules external to the one presently executing. 

Linker— Large programs are generally broken down to component parts and farmed out to several programmers. Each one of 
these parts is called a MODULE. Each programmer will develop the module using either high level or assembly language, then 
“assemble” assembly language modules or “compile” high level language modules. A programmer tells the linker how to 
connect these modules to make the program run. The linker makes these connections, resolves all questions about data 
needed by one module, but contained in another, finds all library routines, and cleans up any other loose ends. The output from 
the linker is called a BINARY file and is the file that will run on the computer. 

Logical Address Space— The range of addresses which a programmer can assign in a software program. This range is 
determined by the length of the computer’s address registers. 

LSB — Least Significant Bit. The bit in a string of bits representing the lowest value. 

Machine Code — The code that a computer recognizes. Specifies internal register files and operations that directly control the 
computer’s internal hardware. 

Machine Language— The ones and zeros which are “understood” by the machine. This is often called “Binary Code.” The 
programmer must be able to understand the bit patterns to be able to decipher the language. Each machine has a unique 
machine language. 

Main Memory — The program and data storage area in a computer system which is physically addressed by the microprocessor 
or MMU address lines. 

Mantissa — In a floating-point number, this is the fractional component. 

Mapping — The process whereby the operating system assigns physical addresses in main memory to the logical addresses 
assigned by the software. 

Memory-Mapped — Referring to peripheral hardware devices which are addressed as if they were part of the computer’s 
memory space. They are accessed in the same manner as main memory, i.e., through memory read/write operations. 

Microcode — A sequence of primitive instructions that control the internal hardware of a computer. Their execution is initiated by 
the decoding of a software instruction. Microcode is maintained in special storage and often used in place of hardwired logic. 

Microcomputer — A computer system whose Central Processing Unit is a Microprocessor. Generally refers to a board-level 
product. 

Minicomputer — A “box-level” computer with system capabilities generally between that of a microcomputer and a mainframe. 

MMU — Memory Management Unit. This is a slave processor in Series 32000 which aids in the implementation of demand-paged 
virtual memory. It provides logical to physical address translation and initiates an instruction abort to the CPU when a desired 
memory location is not in main memory. 

MOD — Mod Register. In the Series 32000, a 16-bit CPU register which holds the address of the Module Descriptor of the 
currently executing software module. 

Module — An independent subprogram that performs a specific function and is usually part of a task, i.e., part of a larger 
program. 

Module Descriptor — In the Series 32000, a set of four 32-bit entries found in main memory. Three are currently defined and 
point to the static data area, link table, and first instruction of the module it describes. The fourth is reserved. 
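
The layout described above can be sketched as a 16-byte record of four 32-bit entries. The field order follows the order in which the definition lists them, and the little-endian byte order is an assumption of this sketch; the names `ModuleDescriptor` and `parse_module_descriptor` are hypothetical.

```python
import struct
from collections import namedtuple

ModuleDescriptor = namedtuple(
    "ModuleDescriptor", "static_base link_base program_base reserved"
)


def parse_module_descriptor(raw):
    """Decode a 16-byte memory image into its four 32-bit entries."""
    if len(raw) != 16:
        raise ValueError("a module descriptor is four 32-bit entries (16 bytes)")
    # "<4I" = four unsigned 32-bit integers, little-endian (assumed).
    return ModuleDescriptor(*struct.unpack("<4I", raw))
```

A caller would read the resulting `static_base`, `link_base` and `program_base` fields and ignore the reserved entry.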

Modularity — A software concept which provides a means of overcoming natural human limitations for dealing with programming 
complexity by specifying the subdivision of large and complex programming tasks into smaller and simpler subprograms, or 
modules, each of which performs some well-defined portion of the complete processing task. 

MSB — Most Significant Bit. The bit in a string of bits representing the highest value. 

NET — Short for NETWORK and describes a number of computers connected to each other via phone or high speed links. A net 
is convenient for exchanging common information in the form of “mail” as well as for data exchange. 

NMI — Nonmaskable Interrupt. A hardware interrupt which cannot be disabled by software. It is generally the highest priority 
interrupt. 

Object Code — Output from a compiler or assembler which is itself executable machine code (or is suitable for processing to 
produce executable machine code). 

Operand — In a computer, a datum which is processed by the CPU. It is referenced by the address part of an instruction. 

Operating System — A collection of integrated service routines used by the computer to control the sequence of programs. The 
operating system consists of software which controls the execution of computer programs and which may provide storage 
assignment, input/output control, scheduling, data management, accounting, debugging, editing, and related services. Operating 
systems vary in sophistication from small monitor systems, like those used on boards, to the large, complex systems used on 
mainframes. 

Operating System Mode — In this mode, the CPU can execute all instructions in the instruction set, access all bits in the 
Processor Status Register, and access any memory location available to the processor. 

Operator — In the description of an instruction, it is the action to be performed on operands. 

Page Fault — A hardware generated trap used to tell the operating system to bring the missing page in from secondary storage. 

Page Swap — The exchange of a page of software in secondary storage with another page located in main memory. The 
operating system supervises this operation, which is executed by the CPU and involves external devices such as disk and DMA 
controllers. 

Page Table — A 1K-byte area in main memory containing 256 entries which describe the location and attributes of all pointer 
tables, i.e., a list of pointer table addresses. 

Peripheral— A device which is part of the computer system and operates under the supervision of the CPU. Peripheral devices 
are often physically separated from the CPU. 

Pascal — A high level language designed originally to teach structured programming. It has become popular in the software 
community and has been expanded to be a versatile language in industry. 

Physical Address — The address presented to main memory, either by the CPU or MMU. 

Pointer Table — A 512-byte page located either in main memory or secondary storage containing 128 entries. Each entry 
describes an individual page of the software program. Each page of the software program may reside in main memory or in 
secondary storage. 
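
Taken together, the Page Table and Pointer Table entries suggest a two-level lookup. The sketch below assumes a 24-bit logical address split into an 8-bit page table index (256 entries), a 7-bit pointer table index (128 entries) and a 9-bit offset within a 512-byte page. That split is consistent with the entry counts given but is an assumption of this sketch, as are all names; the tables are modeled as nested dictionaries, with a missing entry standing for a page that is not in main memory.

```python
class PageFault(Exception):
    """Raised when the addressed page is not resident in main memory."""


def split_logical_address(addr):
    """Split a 24-bit logical address into (page index, pointer index, offset)."""
    if not 0 <= addr < (1 << 24):
        raise ValueError("expected a 24-bit logical address")
    offset = addr & 0x1FF                 # low 9 bits: byte within the page
    pointer_index = (addr >> 9) & 0x7F    # next 7 bits: entry in pointer table
    page_index = (addr >> 16) & 0xFF      # top 8 bits: entry in page table
    return page_index, pointer_index, offset


def translate_two_level(addr, page_table):
    """Translate a logical address, or raise PageFault if it is not mapped."""
    page_index, pointer_index, offset = split_logical_address(addr)
    pointer_table = page_table.get(page_index)
    if pointer_table is None:
        raise PageFault(hex(addr))
    page_base = pointer_table.get(pointer_index)
    if page_base is None:
        raise PageFault(hex(addr))
    return page_base + offset
```

In a real system the operating system would respond to the fault by bringing the page in from secondary storage and retrying the access.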

Pop — To read a datum from the top of a stack. 

PORT — To port an operating system is to cause that particular operating system to operate with a defined hardware package. 
GENIX is the NSC version of UNIX which has been ported to SYS32. The operating system for other Series 32000 based 
systems will differ in some degree from SYS32 and the NSC GENIX binary will not operate. It is now necessary to modify GENIX 
to fit the situation caused by the new hardware. The GENIX SOURCE is used because this is the program that is most readily 
understood by the programmer. The source is changed, compiled, and linked to get a new binary for that particular machine. 

Primitive Data Type — A data type which can be directly manipulated by the hardware. With Series 32000, these are integers, 
floating-point numbers, Booleans, BCD digits, and bit fields. 

Procedure — A subprogram which performs a particular function required by a module, i.e., by a larger program; an ordered set 
of instructions that have a general or frequent use. 

Process — A task. 

Program Base — Module Descriptor entry which points to the first instruction in the module being described. 

Program Counter — CPU register which specifies the logical address of the currently executing instruction. 

Protection — The process of restricting a software program’s access to certain portions of memory using hardware mechanisms. 
Typically done at the operating system and page level. 

PSR — Processor Status Register. A 16-bit register on Series 32000 CPUs which contains bits used by the software to make 
decisions and determine program flow. 

Push — To write a datum to the top of a stack. 

Quad word— Four words, i.e., 64 bits. 

Queue — A First-In-First-Out data storage area, in which the data may be removed at a rate different from that at which it was 
stored. 

Real Time — The actual time in human terms, related to a process. In a UNIX system, real time is the total elapsed time; CPU time is 
the percent of time a process is actually in the CPU; sys time is the time spent in system mode; and user time is the time spent in 
user mode. 

Real Time Operating Systems — An operating system which operates with a known and predictable response time limit, so that 
it can control a physical event. 

Record— A structured data type with multiple elements, each of which may be of a different data type, e.g., strings, arrays, 
bytes, etc. 

Register — A temporary storage location, usually in the CPU, which holds digital data. 

Relative Address — The number that specifies the difference between the base address and the absolute address. 

Relocatable — In reference to software programs, this is code which can be loaded into any location in main memory without 
affecting the operation of the program. 

Return Address — The address to which a subroutine call, interrupt or trap subroutine will return after it is finished executing. 

Routine — A procedure. 

Royalty — Royalty is money paid to the inventor for each item of product sold. A good analogy to use is the music business. Any 
time a song is used, the songwriter is paid a royalty. Think of UNIX as a song and GENIX or SYSTEM V as special arrangements. 
For each shipment of GENIX or SYSTEM V, the customer pays a royalty to NSC who, in turn, pays a royalty to AT&T. 

SB — In the Series 32000, Static Base Register. Points to the start of the static data area for the currently executing module. 

Secondary Storage — This is generally slow-access, nonvolatile memory such as a hard-disk which is used to store the pages 
of software programs not currently needed by the CPU. 

Segmented Address Space — Term used to describe the division of allocatable memory space into blocks or segments of 
variable size. 

Setup Time — The minimum amount of time that data must be present at an input to ensure data acceptance when the device is 
clocked. 

Slave Processor — A processor which cooperates with the main microprocessor in executing certain instructions from the 
instruction stream. A slave processor generally accelerates certain functions which increases overall system throughput. 
Examples of slave processors are the FPU and MMU of Series 32000. 

Software — Programs or data structures that execute instructions or cause instructions to be executed and that will cause the 
computer to do work. 

Software License — NSC does not sell software. Rather, we license the right to use our software. A software license is required 
for all Series 32000 software. We use the license to protect NSC’s interests and to assist in honoring our commitment to AT&T. 
The license is also the vehicle which we use to track customers so that updates can be issued in a timely manner. 

Software Q/A — It is the charter of the Quality Assurance people to ensure that a software product is “bug” free when it reaches 
the customer. In the real world, it is impossible to test every combination of functions, so some bugs do get through. The 
Q/A engineer develops test programs which rigorously test the product prior to its introduction to the market place. 

SP1— In the Series 32000, User Stack Pointer. Points to the top of the User Stack and is selected for all stack operations while 
in User Mode. 

SP0 — In the Series 32000, Interrupt Stack Pointer. Points to the top of the interrupt stack. It is used by the operating system 
whenever an interrupt or trap occurs. 

Stack— A one-dimensional data structure in which values are entered and removed one datum at a time from a location called 
the Top-of-Stack. To the programmer, it appears as a block of memory and a variable called the Stack Pointer (which points to 
the top of the stack). 
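
The programmer's view described above, a block of memory plus a Stack Pointer, can be modeled in a few lines. This sketch assumes the stack grows toward lower addresses, so a push decrements the pointer and a pop increments it; the class name `Stack` and its error handling are illustrative only.

```python
class Stack:
    """A block of memory plus a stack pointer marking the Top-of-Stack."""

    def __init__(self, size):
        self.memory = [None] * size
        self.sp = size  # empty stack: pointer sits just past the block

    def push(self, datum):
        """Write a datum to the top of the stack."""
        if self.sp == 0:
            raise OverflowError("stack is full")
        self.sp -= 1                    # move the top down (assumed direction)
        self.memory[self.sp] = datum

    def pop(self):
        """Read a datum from the top of the stack."""
        if self.sp == len(self.memory):
            raise IndexError("stack is empty")
        datum = self.memory[self.sp]
        self.sp += 1
        return datum
```

Values come back in the reverse of the order in which they were pushed, which is what distinguishes a stack from a FIFO.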

Stack Pointer — CPU register which points to the top of a stack. 

Static Base Register— A 32-bit CPU register which points to the beginning of the static data area for the currently executing 
module. 

String — An array of integers, all of the same length. The integers may be bytes, words, or double words. The integers may be 
interpreted in various ways (see ASCII). 

Subroutine — A self-contained program which is part of a procedure. 

Symmetry — A computer architecture is said to be symmetrical when any instruction can specify any operand length (byte, word 
or double word) and make use of any address-data register or memory location while using any addressing mode. 

Synchronous — Refers to two or more things made to happen in a system at the same time, by means of a common clock 
signal. 

Tag — A label appended to some data entry used in a look-up process whereby the desired datum can be identified by its tag. 

Task — The highest-level subdivision of a user software program. The largest program entity that a computer’s hardware directly 
deals with. 

TCU — Timing Control Unit. A device used to provide system clocks, bus control signals and bus cycle extension capability for 
Series 32000. 

Trap — An internally generated interrupt request caused as a direct and immediate result of the encounter of an event. 

T-State — One clock period. If the system clock frequency is 10 MHz, one T-State will take 100 ns to complete. Operations 
internal and external to the CPU are synchronized to the beginning and middle of the T-States. There are four T-States in a 
normal Series 32000 CPU bus cycle. 
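
The arithmetic in this entry can be checked with a short sketch. The four T-States per normal bus cycle come from the definition above; treating each wait state as one additional clock period follows the Wait-State entry. The function names are hypothetical.

```python
def t_state_ns(clock_hz):
    """Duration of one T-State (one clock period) in nanoseconds."""
    return 1e9 / clock_hz


def bus_cycle_ns(clock_hz, wait_states=0):
    """Duration of a bus cycle: four T-States plus any added wait states."""
    return (4 + wait_states) * t_state_ns(clock_hz)
```

At 10 MHz a T-State lasts 100 ns, so a normal bus cycle takes 400 ns and each wait state extends it by a further 100 ns.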

UNIX™ — An operating system developed at Bell Laboratories in the early 1970s. Software programs that run under UNIX are 
written in the high-level language C, making them highly portable. UNIX systems do not distinguish user programs from operating 
system programs in either capability or usage, and they allow users to route the output of one program directly into the input 
of another. This operating system is unique and is becoming very popular in the microcomputer world. 

USENET— A net to which UNIX systems in the United States connect. Some systems in Europe and Australia also use this net 
for the purpose of passing information. 

User— A software program. The total set of tasks (instructions) that accomplish a desired result. Tasks are managed by the 
operating system. 

User Mode — Machine state in which the executing procedure has limited use of the instruction set and limited access to 
memory and the PSR. 

uucp — Software which allows UNIX computers to pass information to other UNIX systems. 

Variable — A parameter that can assume any of a given set of values. 

Vector— Byte provided by the ICU (Interrupt Control Unit) which tells the CPU where within the Descriptor table the descriptor is 
located for the interrupt it has just requested. 

Virtual Address— Address generated by the user to the available address space which is translated by the computer and 
operating system to a physical address of available memory. 

Virtual Memory— The storage space that may be regarded as addressable main storage by the system. The operating system 
maps Virtual addresses into physical (main memory) addresses. The size of virtual memory is limited by the method of memory 
management employed and by the amount of secondary storage available, not by the actual number of main storage locations, 
so that the user does not have to worry about real memory size or allocation. 

VMS— This is the operating system designed by Digital Equipment Corporation for their VAX series of computers. The original 
Series 32000 software was developed on a VAX which was being controlled by the VMS Operating System. 

Wait-State — An additional clock period added to a CPU memory cycle which gives an external memory device additional time to 
provide the CPU with data. Also used by bus arbitration circuitry to hold the CPU in an idle state until access to a shared 
resource is gained. 

Winchester — Small, hard-disk media commonly found in personal computers. 

Word— A character string or bit string considered as the primary data entity. For historical reasons, a word is a group of 16 bits 
in Series 32000 systems. 

Westerville 

Bell Industries 

Zentronics 

(716)887-2800 

Hamilton/Avnet 

(801)255-9611 

(416) 451-9600 

Fairport 

(614) 882-7004 

Salt Lake City 

Mississauga 

Pioneer Standard 

OKLAHOMA 

Anthem Electronics 

Hamilton/Avnet 

(716)381-7070 

Tulsa 

(801)973-8555 

(416)677-7432 

Time Electronics 

Arrow Electronics 

Arrow Electronics 

Nepean 

(716) 383-8853 

(918) 252-7537 

(801)973-6913 

Hamilton/Avnet 

Hauppauge 

Hamilton/Avnet 

Hamilton/Avnet 

(613) 226-1700 

Anthem Electronics 

(918) 252-7297 

(801) 972-4300 

Zentronics 

(516) 273-1660 

Radio Inc. 

West Valley 

(613) 226-8840 

Arrow Electronics 

(918) 587-9123 

Time Electronics 

Ottawa 

(516)231-1000 

OREGON 

(801)973-8181 

Semad Electronics 

Hamilton/Avnet 

Beaverton 

WASHINGTON 

(613) 727-8325 

(516) 434-7413 

Almac-Stroum Electronics 

Bellevue 

Pointe Claire 

Time Electronics 

(503) 629-8090 

Almac-Stroum Electronics 

Semad Electronics 

(516) 273-0100 

Anthem Electronics 

(206) 643-9992 

(514) 694-0860 

Port Chester 

(503) 643-1114 

Bothell 

St. Laurent 

Zeus Components 

Arrow Electronics 

Anthem Electronics 

Hamilton/Avnet 

(914) 937-7400 

(503) 645-6456 

(206)483-1700 

(514)335-1000 

Rochester 

Hamilton/Avnet 

Kent 

Zentronics 

Arrow Electronics 

(503) 627-0201 

Arrow Electronics 

(514)737-9700 

(716) 427-0300 

Lake Oswego 

(206) 575-4420 

Willowdale 

Hamilton/Avnet 

Bell Industries 

Redmond 

ElectroSonic Inc. 

(716)475-9130 

Summit Electronics 
(716) 334-8110 

Ronkonkoma 

Zeus Components 
(516) 737-4500 

Syracuse 

Hamilton/Avnet 
(315) 437-2641 

Time Electronics 
(315) 432-0355 

Westbury 

Hamilton/Avnet Export Div. 
(516) 997-6868 

Woodbury 

Pioneer Electronics 
(516) 921-8700 

(503) 635-6500 

PENNSYLVANIA 

Horsham 

Anthem Electronics 
(215) 443-5150 

Pioneer Technology 
(215) 674-4000 

King of Prussia 

Time Electronics 
(215) 337-0900 

Monroeville 

Arrow Electronics 
(412) 856-7000 

Hamilton/Avnet 
(206) 881-6697 

(416) 494-1666 



SALES OFFICES 


ALABAMA 

Huntsville 
(205) 721-9367 

ARIZONA 

Tempe 

(602) 966-4563 
CALIFORNIA 
Inglewood 
(213) 645-4226 
Roseville 
(916) 786-5577 
San Diego 
(619) 587-0666 
Santa Clara 
(408) 562-5900 
Tustin 

(714) 259-8880 
Woodland Hills 
(818) 888-2602 

COLORADO 

Boulder 

(303) 440-3400 
Colorado Springs 
(303) 578-3319 
Englewood 
(303) 790-8090 
CONNECTICUT 
Hamden 
(203) 288-1560 


FLORIDA 

Boca Raton 
(407) 997-8133 
Orlando 
(305) 629-1720 
St. Petersburg 
(813) 577-1380 
GEORGIA 
Norcross 
(404) 441-2740 
ILLINOIS 
Schaumburg 
(312) 397-8777 
INDIANA 
Carmel 

(317) 843-7160 
Fort Wayne 
(219) 484-0722 

IOWA 

Cedar Rapids 
(319) 395-0090 
KANSAS 
Overland Park 
(913) 451-4402 
MARYLAND 
Hanover 
(301) 796-8900 
MASSACHUSETTS 
Burlington 
(617) 273-3170 


MICHIGAN 

Grand Rapids 
(616) 940-0588 
W. Bloomfield 
(313) 855-0166 
MINNESOTA 
Bloomington 
(612) 854-8200 
NEW JERSEY 
Paramus 
(201) 599-0955 
NEW MEXICO 
Albuquerque 
(505) 884-5601 
NEW YORK 
Fairport 

(716) 223-7700 
Liverpool 
(315) 451-9091 
Melville 

(516) 351-1000 
Wappinger Falls 
(914) 298-0680 

NORTH CAROLINA 

Cary 

(919) 481-4311 
OHIO 
Dayton 

(513) 435-6886 
Dublin 

(614) 766-3679 
Independence 
(216) 524-5577 


ONTARIO 

Mississauga 
(416) 678-2920 
Nepean 
(613) 596-0411 
OREGON 
Portland 
(503) 639-5442 
PENNSYLVANIA 
Horsham 
(215) 672-6767 
PUERTO RICO 
Rio Piedras 
(809) 758-9211 
QUEBEC 
Lachine 
(514) 636-8525 
TEXAS 
Austin 

(512) 346-3990 
Houston 
(713) 771-3547 
Richardson 
(214) 234-3811 
UTAH 

Salt Lake City 
(801) 322-4747 
WASHINGTON 
Bellevue 
(206) 453-9944 
WISCONSIN 
Brookfield 
(414) 782-1818 
National 

Semiconductor 



National Semiconductor Corporation 

2900 Semiconductor Drive 
P.O. Box 58090 
Santa Clara, CA 95052-8090 
Tel: (408) 721-5000 
TWX: (910) 339-9240 


SALES OFFICES (Continued) 


INTERNATIONAL 

OFFICES 

Electronica NSC de Mexico SA 

Juventino Rosas No 118-2 
Col Guadalupe Inn 
Mexico, 01020 D.F. Mexico 
Tel: 52-5-524-9402 
National Semicondutores 
Do Brasil Ltda. 

Av. Brig. Faria Lima, 1383 

6.° Andar-Conj. 62 

01451 Sao Paulo, SP Brasil 

Tel: (55/11) 212-5066 

Fax: (55/11)211-1181 NSBR BR 

National Semiconductor GmbH 

Industriestrasse 10 

D-8080 Fürstenfeldbruck 

West Germany 

Tel: (0-81-41) 103-0 

Telex: 527-649 

National Semiconductor (UK) Ltd. 

The Maple, Kembrey Park 

Swindon, Wiltshire SN2 6UT 

United Kingdom 

Tel: (07-93)61-41-41 

Telex: 444-674 

Fax: (07-93)69-75-22 

National Semiconductor Benelux 

Vorstlaan 100 

B-1170 Brussels 

Belgium 

Tel: (02) 6-61-06-80 
Telex: 61007 

National Semiconductor (UK) Ltd. 

Ringager 4A, 3 
DK-2605 Brøndby 
Denmark 
Tel: (02)43-32-11 
Telex: 15-179 
Fax: (02)43-31-11 


National Semiconductor S.A. 

Centre d'Affaires-La Boursidière 
Bâtiment Champagne, B.P. 90 
Route Nationale 186 
F-92357 Le Plessis Robinson 
France 

Tel: (1) 40-94-88-88 
Telex: 631065 
Fax: (1) 40-94-88-11 

National Semiconductor (UK) Ltd. 

Unit 2A 

Clonskeagh Square 

Clonskeagh Road 

Dublin 14 

Tel: (01) 69-55-89 

Telex: 91047 

Fax: (01)69-55-89 

National Semiconductor S.p.A. 

Strada 7, Palazzo R/3 

20089 Rozzano 

Milanofiori 

Italy 

Tel: (02) 8242046/7/8/9 

National Semiconductor S.p.A. 

Via del Caravaggio, 107 

00147 Rome 

Italy 

Tel: (06) 5-13-48-80 
Fax: (06) 5-13-79-47 

National Semiconductor (UK) Ltd. 

P.O. Box 29 
N-1321 Stabekk 
Norway 

Tel: (2) 12-53-70 
Fax: (2) 12-53-75 

National Semiconductor AB 

Box 2016 

Stensätravägen 13 
S-12702 Skärholmen 
Sweden 

Tel: (08)970190 
Telex: 10731 


National Semiconductor 

Calle Agustín de Foxá, 27 

28036 Madrid 

Spain 

Tel: (01)733-2958 
Telex: 46133 

National Semiconductor 
Switzerland 

Alte Winterthurerstrasse 53 
Postfach 567 

CH-8304 Wallisellen-Zurich 
Switzerland 
Tel: (01) 830-2727 
Telex: 828-444 

National Semiconductor 

Kauppakartanonkatu 7 A22 
SF-00930 Helsinki 
Finland 

Tel: (90) 33-80-33 
Telex: 126116 

National Semiconductor 

Postbus 90 

1380 AB Weesp 

The Netherlands 

Tel: (0-29-40) 3-04-48 

Telex: 10-956 

Fax: (0-29-40) 3-04-30 

National Semiconductor Japan 

Ltd. 

Sanseido Bldg. 5F 
4-15 Nishi Shinjuku 
Shinjuku-ku 
Tokyo 160 Japan 
Tel: 3-299-7001 
Fax: 3-299-7000 


National Semiconductor 
Hong Kong Ltd. 

Suite 513, 5th Floor 
Chinachem Golden Plaza, 

77 Mody Road, Tsimshatsui East, 

Kowloon, Hong Kong 

Tel: 3-7231290 

Telex: 52996 NSSEA HX 

Fax: 3-3112536 

National Semiconductor 

(Australia) PTY. Ltd. 

1st Floor, 441 St. Kilda Rd. 

Melbourne, 3004 

Victoria, Australia 

Tel: (03) 267-5000 

Fax: 61-3-2677458 

National Semiconductor (PTE), 

Ltd. 

200 Cantonment Road 13-01 
Southpoint 

Singapore 0208 
Tel: 2252226 
Telex: RS 33877 

National Semiconductor (Far East) 
Ltd. 

Taiwan Branch 

P.O. Box 68-332 Taipei 
7th Floor, Nan Shan Life Bldg. 
302 Min Chuan East Road, 

Taipei, Taiwan R.O.C. 

Tel: (86) 02-501-7227 

Telex: 22837 NSTW 

Cable: NSTW TAIPEI 

National Semiconductor (Far East) 

Ltd. 

Korea Branch 

13th Floor, Dai Han Life Insurance 
63 Building, 

60, Yoido-dong, Youngdeungpo-ku, 

Seoul, Korea 150-763 

Tel: (02) 784-8051/3, 785-0696/8 

Telex: 24942 NSPKLO 

Fax: (02) 784-8054 